Identification and use of circulating tumor markers

ABSTRACT

Methods for creating a library of recurrently mutated genomic regions and for using the library to analyze cancer-specific and patient-specific genetic alterations in a patient are provided. The methods can be used to measure tumor-derived nucleic acids in patient blood and thus to monitor the progression of disease. The methods can also be used for cancer screening.

STATEMENT OF GOVERNMENTAL SUPPORT

This invention was made with government support under grant number W81XWH-12-1-0285 awarded by the Department of Defense. The government has certain rights in the invention.

BACKGROUND OF THE INVENTION

Analysis of cancer-derived cell-free DNA (cfDNA) has the potential to revolutionize detection and monitoring of cancer. Noninvasive access to malignant DNA is particularly attractive for solid tumors, which cannot be repeatedly sampled without invasive procedures. In non-small cell lung cancer (NSCLC), PCR-based assays have been used previously to detect recurrent point mutations in genes such as KRAS or EGFR in plasma DNA (Taniguchi et al. (2011) Clin. Cancer Res. 17:7808-7815; Gautschi et al. (2007) Cancer Lett. 254:265-273; Kuang et al. (2009) Clin. Cancer Res. 15:2630-2636; Rosell et al. (2009) N. Engl. J. Med. 361:958-967), but the majority of patients lack mutations in these genes. Other studies have proposed identifying patient-specific chromosomal rearrangements in tumors via whole genome sequencing (WGS), followed by breakpoint qPCR from cfDNA (Leary et al. (2010) Sci. Transl. Med. 2:20ra14; McBride et al. (2010) Genes Chrom. Cancer 49:1062-1069). While sensitive, such methods require optimization of molecular assays for each patient, limiting their widespread clinical application. More recently, several groups have reported amplicon-based deep sequencing methods to detect cfDNA mutations in up to 6 recurrently mutated genes (Forshew et al. (2012) Sci. Transl. Med. 4:136ra168; Narayan et al. (2012) Cancer Res. 72:3492-3498; Kinde et al. (2011) Proc. Natl Acad. Sci. USA 108:9530-9535). While powerful, these approaches are limited by the number of mutations that can be interrogated (Rachlin et al. (2005) BMC Genomics 6:102) and the inability to detect genomic fusions.

PCT International Patent Publication No. 2011/103236 describes methods for identifying personalized tumor markers in a cancer patient using “mate-paired” libraries. The methods are limited to monitoring somatic chromosomal rearrangements, however, and must be personalized for each patient, thus limiting their applicability and increasing their cost.

U.S. Patent Application Publication No. 2010/0041048 A1 describes the quantitation of tumor-specific cell-free DNA in colorectal cancer patients using the “BEAMing” technique (Beads, Emulsion, Amplification, and Magnetics). While this technique provides high sensitivity and specificity, this method is for single mutations and thus any given assay can only be applied to a subset of patients and/or requires patient-specific optimization. U.S. Patent Application Publication No. 2012/0183967 A1 describes additional methods to identify and quantify genetic variations, including the analysis of minor variants in a DNA population, using the “BEAMing” technique.

U.S. Patent Application Publication No. 2012/0214678 A1 describes methods and compositions for detecting fetal nucleic acids and determining the fraction of cell-free fetal nucleic acid circulating in a maternal sample. While sensitive, these methods analyze polymorphisms occurring between maternal and fetal nucleic acids rather than polymorphisms that result from somatic mutations in tumor cells. In addition, methods that detect fetal nucleic acids in maternal circulation require much less sensitivity than methods that detect tumor nucleic acids in cancer patient circulation, because fetal nucleic acids are much more abundant than tumor nucleic acids.

U.S. Patent Application Publication Nos. 2012/0237928 A1 and 2013/0034546 describe methods for determining copy number variations of a sequence of interest in a test sample comprising a mixture of nucleic acids. While potentially applicable to the analysis of cancer, these methods are directed to measuring major structural changes in nucleic acids, such as translocations, deletions, and amplifications, rather than single nucleotide variations.

U.S. Patent Application Publication No. 2012/0264121 A1 describes methods for estimating a genomic fraction, for example, a fetal fraction, from polymorphisms such as small base variations or insertions-deletions. These methods do not, however, make use of optimized libraries of polymorphisms, such as, for example, libraries containing recurrently-mutated genomic regions.

U.S. Patent Application Publication No. 2013/0024127 A1 describes computer-implemented methods for calculating a percent contribution of cell-free nucleic acids from a major source and a minor source in a mixed sample. The methods do not, however, provide any advantages in identifying or making use of optimized libraries of polymorphisms in the analysis.

PCT International Publication No. WO 2010/141955 A2 describes methods of detecting cancer by analyzing panels of genes from a patient-obtained sample and determining the mutational status of the genes in the panel. The methods rely on a relatively small number of known cancer genes, however, and they do not provide any ranking of the genes according to effectiveness in detection of relevant mutations. In addition, the methods were unable to detect the presence of mutations in the majority of serum samples from actual cancer patients.

There is thus a need for new and improved methods to detect and monitor tumor-related nucleic acids in cancer patients.

SUMMARY OF THE INVENTION

The present invention addresses these and other problems by providing novel methods and systems relating to the characterization, diagnosis, and monitoring of cancer. In particular, according to one aspect, the invention provides methods for creating a library of recurrently mutated genomic regions comprising:

identifying a plurality of genomic regions from a group of genomic regions that are recurrently mutated in a specific cancer;

wherein the library comprises the plurality of genomic regions;

the plurality of genomic regions comprises at least 10 different genomic regions; and

at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.

In specific embodiments of these methods, the plurality of genomic regions comprises at least 25, at least 50, at least 100, at least 150, at least 200, or at least 500 different genomic regions.

In other specific method embodiments, at least two mutations within the plurality of genomic regions or at least three mutations within the plurality of genomic regions are present in at least 60% of all subjects with the specific cancer.

In still other specific method embodiments, at least one mutation within the plurality of genomic regions is present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% of all subjects with the specific cancer.

In some embodiments, the identifying step comprises, for each genomic region in the plurality of genomic regions, ranking the genomic region to maximize the number of all subjects with the specific cancer having at least one mutation within the genomic region.

In other embodiments, the identifying step comprises, for each genomic region in the plurality of genomic regions, ranking the genomic region to maximize the ratio between the number of all subjects with the specific cancer having at least one mutation within the genomic region and the length of the genomic region.

In some embodiments, the library comprises a plurality of genomic regions encoding a plurality of driver sequences, more specifically known driver sequences or driver sequences that are recurrently mutated in the specific cancer.

In some embodiments, the library comprises a plurality of genomic regions that are recurrently rearranged in the specific cancer.

In preferred embodiments, the specific cancer is a carcinoma, and in more preferred embodiments, the carcinoma is an adenocarcinoma, a non-small cell lung cancer, or a squamous cell carcinoma.

In specific embodiments, the cumulative length of the plurality of genomic regions is at most 30 Mb, 20 Mb, 10 Mb, 5 Mb, 2 Mb, 1 Mb, 500 kb, 200 kb, 100 kb, 50 kb, 20 kb, or 10 kb.

In another aspect, the invention provides methods for analyzing a cancer-specific genetic alteration in a subject comprising the steps of:

obtaining a tumor nucleic acid sample and a genomic nucleic acid sample from a subject with a specific cancer;

sequencing a plurality of target regions in the tumor nucleic acid sample and in the genomic nucleic acid sample to obtain a plurality of tumor nucleic acid sequences and a plurality of genomic nucleic acid sequences; and

comparing the plurality of tumor nucleic acid sequences to the plurality of genomic nucleic acid sequences to identify a patient-specific genetic alteration in the tumor nucleic acid sample;

wherein the plurality of target regions are selected from a plurality of genomic regions that are recurrently mutated in the specific cancer;

the plurality of genomic regions comprises at least 10 different genomic regions; and

at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.

In specific embodiments of this aspect of the invention, the plurality of genomic regions comprises at least 25, at least 50, at least 100, at least 150, at least 200, or at least 500 different genomic regions.

In other specific embodiments, at least two mutations within the plurality of genomic regions or at least three mutations within the plurality of genomic regions are present in at least 60% of all subjects with the specific cancer.

In still other specific embodiments, at least one mutation within the plurality of genomic regions is present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% of all subjects with the specific cancer.

In some embodiments, each genomic region in the plurality of genomic regions is identified by ranking the genomic region to maximize the number of all subjects with the specific cancer having at least one mutation within the genomic region.

In other embodiments, each genomic region in the plurality of genomic regions is identified by ranking the genomic region to maximize the ratio between the number of all subjects with the specific cancer having at least one mutation within the genomic region and the length of the genomic region.

In some embodiments, the plurality of genomic regions comprises genomic regions encoding a plurality of driver sequences, more specifically known driver sequences or driver sequences that are recurrently mutated in the specific cancer.

In some embodiments, the plurality of genomic regions comprises genomic regions that are recurrently rearranged in the specific cancer.

In preferred embodiments, the specific cancer is a carcinoma, and in more preferred embodiments, the carcinoma is an adenocarcinoma, a non-small cell lung cancer, or a squamous cell carcinoma.

In specific embodiments, the cumulative length of the plurality of genomic regions is at most 30 Mb, 20 Mb, 10 Mb, 5 Mb, 2 Mb, 1 Mb, 500 kb, 200 kb, 100 kb, 50 kb, 20 kb, or 10 kb.

In some embodiments, the methods further comprise the steps of:

obtaining a cell-free nucleic acid sample from the subject; and

identifying the patient-specific genetic alteration in the cell-free nucleic acid sample.

In specific embodiments, the step of identifying the patient-specific genetic alteration in the cell-free nucleic acid sample comprises sequencing a genomic region comprising the patient-specific genetic alteration in the cell-free sample.

In other specific embodiments, the step of obtaining a tumor nucleic acid sample and a genomic nucleic acid sample comprises the step of enriching the plurality of target regions in the tumor nucleic acid sample and the genomic nucleic acid sample, and in more specific embodiments, the enriching step comprises use of a custom library of biotinylated DNA.

In still other specific embodiments, the step of obtaining a cell-free nucleic acid sample comprises the step of enriching the plurality of target regions in the cell-free nucleic acid sample, and in still more specific embodiments, the enriching step comprises use of a custom library of biotinylated DNA.

In some embodiments, the methods further comprise the step of quantifying the cancer-specific genetic alteration in the cell-free sample.

In yet another aspect, the invention provides methods for screening a cancer-specific genetic alteration in a subject comprising the steps of:

obtaining a cell-free nucleic acid sample from a subject;

sequencing a plurality of target regions in the cell-free sample to obtain a plurality of cell-free nucleic acid sequences; and

identifying a cancer-specific genetic alteration in the cell-free sample;

wherein the plurality of target regions are selected from a plurality of genomic regions that are recurrently mutated in the specific cancer;

the plurality of genomic regions comprises at least 10 different genomic regions; and

at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.

In specific embodiments, the plurality of genomic regions comprises at least 25, at least 50, at least 100, at least 150, at least 200, or at least 500 different genomic regions.

In other specific embodiments, at least two mutations within the plurality of genomic regions or at least three mutations within the plurality of genomic regions are present in at least 60% of all subjects with the specific cancer.

In still other specific embodiments, at least one mutation within the plurality of genomic regions is present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% of all subjects with the specific cancer.

In particular embodiments, each genomic region in the plurality of genomic regions is identified by ranking the genomic region to maximize the number of all subjects with the specific cancer having at least one mutation within the genomic region.

In other particular embodiments, each genomic region in the plurality of genomic regions is identified by ranking the genomic region to maximize the ratio between the number of all subjects with the specific cancer having at least one mutation within the genomic region and the length of the genomic region.

In still other particular embodiments, the plurality of genomic regions comprises genomic regions encoding a plurality of driver sequences, and, more particularly, the driver sequences are known driver sequences or are recurrently mutated in the specific cancer.

In yet still other particular embodiments, the plurality of genomic regions comprises genomic regions that are recurrently rearranged in the specific cancer.

In some embodiments, the specific cancer is a carcinoma, including, for example, an adenocarcinoma, a non-small cell lung cancer, or a squamous cell carcinoma.

In specific embodiments, the cumulative length of the plurality of genomic regions is at most 30 Mb, 20 Mb, 10 Mb, 5 Mb, 2 Mb, 1 Mb, 500 kb, 200 kb, 100 kb, 50 kb, 20 kb, or 10 kb.

In other specific embodiments, the step of obtaining a cell-free nucleic acid sample comprises the step of enriching the plurality of target regions in the cell-free nucleic acid sample, and, in some embodiments, the enriching step comprises use of a custom library of biotinylated DNA.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Development of CAncer Personalized Profiling by Deep Sequencing (CAPP-Seq). (a) Schematic depicting design of CAPP-Seq selectors and their application for assessing circulating tumor DNA. (b) Multi-phase design of the NSCLC CAPP-Seq selector. (c) Analysis of the number of SNVs per lung adenocarcinoma covered by the NSCLC CAPP-Seq selector in the TCGA WES cohort (Training; N=229) and an independent lung adenocarcinoma WES data set (Validation; N=183) (Imielinski et al. (2012) Cell 150:1107-1120). (d) Number of SNVs per patient identified by the NSCLC CAPP-Seq selector in WES data from three adenocarcinomas from TCGA, colon (COAD), rectal (READ), and endometrioid (UCEC) cancers. (e-f) Quality parameters from a representative CAPP-Seq analysis of plasma cfDNA, including length distribution of sequenced cfDNA fragments (e), and depth of sequencing coverage across all genomic regions in the selector (f). (g) Variation in sequencing depth across cfDNA samples from 4 patients.

FIG. 2. CAPP-Seq computational pipeline. Major steps of the bioinformatics pipeline for mutation discovery and quantitation in plasma are schematically illustrated.

FIG. 3. Statistical enrichment of recurrently mutated NSCLC exons captures known drivers.

FIG. 4. Development of the FACTERA algorithm. Major steps used by FACTERA (see Detailed Methods) to precisely identify genomic breakpoints from aligned paired-end sequencing data are anecdotally illustrated using two hypothetical genes, w and v. (a) Improperly paired, or “discordant,” reads (indicated in yellow) are used to locate genes involved in a potential fusion (in this case, w and v). (b) Because truncated (i.e., soft-clipped) reads may indicate a fusion breakpoint, any such reads within genomic regions delineated by w and v are also further analyzed. (c) Consider soft-clipped reads, R1 and R2, whose non-clipped segments map to w and v, respectively. If R1 and R2 derive from a fragment encompassing a true fusion between w and v, then the mapped portion of R1 should match the soft-clipped portion of R2, and vice versa. This is assessed by FACTERA using fast k-mer indexing and comparison. (d) Four possible orientations of R1 and R2 are depicted. However, only Cases 1a and 2a can generate valid fusions (see Detailed Methods). Thus, prior to k-mer comparison (panel c), the reverse complement of R1 is taken for Cases 1b and 2b, respectively, converting them into Cases 1a and 2a. (e) In some cases, short sequences immediately flanking the breakpoint are identical, preventing unambiguous determination of the breakpoint. Let iterators i and j denote the first matching sequence positions between R1 and R2. To reconcile sequence overlap, FACTERA arbitrarily adjusts the breakpoint in R2 (i.e., bp2) to match R1 (i.e., bp1) using the sequence offset determined by differences in distance between bp2 and i, and bp1 and j. Two cases are illustrated, corresponding to sequence orientations described in (d).
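The consistency check described in panel (c) can be illustrated with a minimal Python sketch. This is not the FACTERA implementation: the function names, the k-mer length, and the representation of each read as separate mapped and soft-clipped strings are assumptions made only for illustration, and the reads are assumed to be already oriented as in Case 1a.

```python
from typing import Set

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement, as would be applied to R1 in Cases 1b and 2b."""
    return seq.translate(_COMPLEMENT)[::-1]

def kmers(seq: str, k: int) -> Set[str]:
    """Index every k-mer of a sequence in a set for fast lookup."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shares_kmer(a: str, b: str, k: int) -> bool:
    """True if sequences a and b have at least one k-mer in common."""
    return not kmers(a, k).isdisjoint(kmers(b, k))

def supports_fusion(r1_mapped: str, r1_clipped: str,
                    r2_clipped: str, r2_mapped: str, k: int = 4) -> bool:
    """Hypothetical check: the mapped part of each read should match
    the soft-clipped part of the other read."""
    return (shares_kmer(r1_mapped, r2_clipped, k)
            and shares_kmer(r2_mapped, r1_clipped, k))

# Toy fragment spanning a fusion between gene w and gene v (made-up sequences).
w_part, v_part = "ACGTTGCAGGCTA", "CCGATTAGGCATG"
r1_mapped, r1_clipped = w_part[5:], v_part[:7]   # R1 maps to w, clipped into v
r2_clipped, r2_mapped = w_part[8:], v_part       # R2 maps to v, clipped into w
print(supports_fusion(r1_mapped, r1_clipped, r2_clipped, r2_mapped))
```

A real implementation would also need to locate the exact breakpoint and resolve the overlap ambiguity described in panel (e), which this sketch does not attempt.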

FIG. 5. Application of FACTERA to NSCLC cell lines NCI-H3122 and HCC78, and Sanger-validation of breakpoints. (a) Pile-up of a subset of soft-clipped reads mapping to the EML4-ALK fusion identified in NCI-H3122 along with the corresponding Sanger chromatogram. (b) Same as (a), but for the SLC34A2-ROS1 translocation identified in HCC78.

FIG. 6. Improvements in CAPP-Seq performance with optimized library preparation procedures.

FIG. 7. Optimizing allele recovery from low input cfDNA during Illumina library preparation.

FIG. 8. CAPP-Seq performance with various amounts of input cfDNA.

FIG. 9. Analysis of CAPP-Seq background, allele detection threshold, and linearity. (a) Analysis of background rate for 6 NSCLC patient plasma samples and a healthy individual (Detailed Methods). (b) Analysis of biological background in (a) focusing on 107 recurrent somatic mutations from a previously reported SNaPshot panel (Su et al. (2011) J. Mol. Diagn. 13:74-84). Mutations found in a given patient's tumor were excluded. The mean frequency for each patient (horizontal red line) was within confidence limits of the mean background limit of 0.007% (horizontal blue line; panel a). A single outlier mutation (TP53 R175H) is indicated by an orange diamond. (c) Individual mutations from (b) ranked by most to least recurrent, according to median frequency across the 7 samples. (d) Dilution series analysis of expected versus observed frequencies of mutant alleles using CAPP-Seq. Dilution series were generated by spiking fragmented HCC78 DNA into control cfDNA. (e) Analysis of the effect of the number of SNVs considered on the estimates of fractional abundance (95% confidence intervals shown in gray). (f) Analysis of the effect of the number of SNVs considered on the mean correlation coefficient between expected and observed cancer fractions (blue dashed line) using data from panel (d). 95% confidence intervals are shown for (a)-(c). Statistical variation for (d) is shown as s.e.m.

FIG. 10. Empirical spiking analysis of CAPP-Seq using two NSCLC cell lines. (a) Expected and observed (by CAPP-Seq) fractions of NCI-H3122 DNA spiked into control HCC78 DNA are linear for all fractions tested (0.1%, 1%, and 10%; R²=1). (b) Using data from (a), analysis of the effect of the number of SNVs considered on the estimates of fractional abundance (95% confidence intervals shown in gray). (c) Analysis of the effect of the number of SNVs considered on the mean correlation coefficient and coefficient of variation between expected and observed cancer fractions (dashed lines) using data from panel (a). (d) Expected and observed fractions of the EML4-ALK fusion present in HCC78 are linear (R²=0.995) over all spiking concentrations tested (see FIG. 5(b) for breakpoint verification). The observed EML4-ALK fractions were normalized based on the relative abundance of the fusion in 100% H3122 DNA (see Detailed Methods for details). Moreover, a single heterozygous insertion (indel) discovered within the selector space of NCI-H3122 (chr7: 107416855, +T) was concordant with defined concentrations (shown are observed fractions adjusted for zygosity).

FIG. 11. Application of CAPP-Seq for noninvasive detection and monitoring of circulating tumor DNA. (a) Characteristics of 11 patients included in this study (Table 3). P-values reflect a two-sided paired t-test for patients with reporter SNVs detected at both time points; other p-values were determined as described in Methods. ND, mutant DNA was not detected above background. Dashes, plasma sample not available. Smoking history, ≥20 pack years (heavy), >0 pack years (light). (b-d) Disease monitoring using CAPP-Seq. Mutant allele frequencies (left y-axis) and absolute concentrations (right y-axis) are shown. The lower limit of detection (defined in FIG. 2(a)-(b)) is indicated by the dashed lines. (b) Pre- and post-surgery circulating tumor DNA levels quantified by CAPP-Seq in a Stage IB and a Stage IIIA NSCLC patient. Complete resections were achieved in both cases. (c) Disease burden changes in response to chemotherapy in a Stage IV NSCLC patient with three rearrangement breakpoints identified by CAPP-Seq. Tumor volume based on CT measurements and CAPP-Seq mutant allele frequencies are shown. Tu, tumor; Ef, pleural effusion. (d) Detection and monitoring of a subclonal EGFR T790M resistance mutation in a patient with Stage IV NSCLC. The fractional abundance of the dominant clone and T790M-containing clone are shown in the primary tumor (left) and plasma samples (right). (e) Predicted transcripts of three fusion genes detected in case P9. (f) Statistically significant co-occurrence of ROS1 fusions and U2AF1 S34F mutations in NSCLC (P=0.0019; two-sided Fisher's exact test). (g) Exploratory analysis of the potential application of CAPP-Seq for cancer screening. Pre-treatment plasma samples from panel (a) and a plasma sample from a healthy individual were examined for the presence of mutant allele outliers without knowledge of the primary tumor mutations (see Detailed Methods). Error bars represent s.e.m.

FIG. 12. Base-pair resolution breakpoint mapping for all patients and cell lines enumerated by FACTERA. Gene fusions involving ALK (a) and ROS1 (b) are graphically depicted. Schematics in the top panels indicate the exact genomic positions (HG19 NCBI Build 37.1/GRCh37) of the breakpoints in ALK, ROS1, EML4, KIF5B, SLC34A2, CD74, MKX, and FYN. Bottom panels depict exons flanking the predicted gene fusions with notation indicating the 5′ fusion partner gene and last fused exon followed by the 3′ fusion partner gene and first fused exon. For example, in S13del37; R34, exons 1-13 of SLC34A2 (excluding the 3′ 37 nucleotides of exon 13) are fused to exons 34-43 of ROS1. Exons in FYN are from its 5′UTR and precede the first coding exon. The green dotted line in the predicted FYN-ROS1 fusion indicates the first in-frame methionine in ROS1 exon 33, which preserves an open reading frame encoding the ROS1 kinase domain. All rearrangements were independently confirmed by PCR and/or FISH.

FIG. 13. Presence of fusions is inversely related to the number of SNVs detected by CAPP-Seq. For each patient listed in FIG. 11(a), the number of identified SNVs versus the presence or absence of detected genomic fusions is plotted. The shading of the symbols is identical to FIG. 11(a), and indicates smoking history. Statistical significance was determined using a two-sided Wilcoxon rank sum test, and error bars indicate s.e.m.

FIG. 14. Different types of reporters are similarly useful for disease monitoring. Three SNVs and an ALK translocation identified in patient 6 are concordant at each time point, showing a comparable drop in fractional abundance after treatment with the ALK kinase inhibitor Crizotinib. Due to small differences in measured allele frequencies at each time point, linear regression was used to fit all allele frequencies to their adjusted mutant cfDNA concentrations (R²=0.93). Thus, the scale on the right y-axis is interpolated. To accurately quantify disease burden, translocation and SNV frequencies were adjusted based on differences in zygosity and sequencing depth in the tumor sample (see Detailed Methods).

FIG. 15. Flow cytometry analysis of P9 pleural effusion. Flow cytometry of cryopreserved cells from a pleural effusion revealed only 0.22% of cells stained positive for the epithelial marker, EpCAM, and negative for the lineage markers CD31 (endothelial cells) and CD45 (immune cells). FACS was used to enrich tumor cells and analysis of tumor-enriched genomic DNA identified 3 fusions (FIG. 11(e)), while unsorted low purity tumor specimen hampered de novo fusion discovery using FACTERA (Detailed Methods).

FIG. 16. Analysis of RNA-Seq data from lung adenocarcinoma patients in TCGA identifies 2 candidate cases with ROS1 rearrangements. (a) ROS1 fusions are known to result in over-expression of the C-terminal kinase domain, and breakpoints typically occur downstream of exon 31 (Bergethon et al. (2012) J. Clin. Oncol. 30:863-870; Rikova et al. (2007) Cell 131:1190-1203; Takeuchi et al. (2012) Nat. Med. 18:378-381). Exon-level RPKM values for ROS1 are plotted for 163 LUAD patients. Two patients (TCGA-05-4426 and TCGA-64-1680) have expression patterns suggestive of ROS1 fusions. (b,c) Pileups of RNA-Seq reads in these two patients illustrate an abundance of reads mapping to regions surrounding ROS1 exon boundaries. Colored reads indicate discordant pairs, consistent with ROS1 fusions. Such pairs map to SLC34A2 for patient TCGA-05-4426 (b) and CD74 for patient TCGA-64-1680 (c). A single soft-clipped RNA-Seq read supports a ROS1-CD74 fusion event in TCGA-64-1680.

FIG. 17. Non-invasive cancer screening with CAPP-Seq, related to FIG. 11(g). (a) Steps to identify candidate SNVs in plasma cfDNA demonstrated using a patient sample with NSCLC (P6, see Table 3). Following stepwise filtration, outlier detection is applied (Detailed Methods). (b) Same as (a), but using a plasma cfDNA sample from a patient who had their tumor surgically removed. No SNVs are identified, as expected. (c) Three additional representative samples applying retrospective screening to patients analyzed in this study. P2 and P5 samples have confirmed tumor-derived SNVs, while P9 is cancer positive but lacks tumor-derived SNVs. Red points, confirmed tumor-derived SNVs; Green points, background noise.

DETAILED DESCRIPTION OF THE INVENTION

Tumors continually shed DNA into the circulation, where it is readily accessible. Stroun et al. (1987) Eur J Cancer Clin Oncol 23:707-712. Provided herein are methods for the ultrasensitive detection of circulating tumor DNA called CAncer Personalized Profiling by Deep Sequencing (CAPP-Seq). Also provided are methods for creating libraries of recurrently mutated genomic regions used in the CAPP-Seq methods. CAPP-Seq targets hundreds of recurrently mutated genomic regions and simultaneously detects point mutations, insertions/deletions, and rearrangements. CAPP-Seq for non-small cell lung cancer has been demonstrated herein with a design that identified mutations in >95% of tumors. CAPP-Seq accurately quantified circulating tumor DNA from early and advanced stage tumors and identified mutant alleles down to 0.025% with a detection limit of <0.01%. Tumor-derived DNA levels paralleled clinical responses to diverse therapies and CAPP-Seq identified actionable mutations in plasma. Moreover, CAPP-Seq identified significant co-occurrence of ROS1 translocations with U2AF1 splicing factor mutations. Finally, the utility of CAPP-Seq for cancer screening is also described. CAPP-Seq can be routinely applied to noninvasively detect and monitor tumors, thus facilitating personalized cancer therapy.

Methods for Creating Libraries

According to one aspect of the invention, methods for creating a library of recurrently mutated genomic regions are provided. The methods comprise the step of identifying a plurality of genomic regions from a group of genomic regions that are recurrently mutated in a specific cancer, wherein the library comprises the plurality of genomic regions, the plurality of genomic regions comprises at least 10 different genomic regions, and at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.
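By way of illustration only, the two numeric criteria in the preceding paragraph (at least 10 different regions, and at least one mutation within the library in at least 60% of subjects) can be checked against a cohort of known mutations. The following Python sketch assumes a hypothetical data layout in which regions are (chromosome, start, end) tuples with an exclusive end and each subject maps to a list of (chromosome, position) mutations; none of these names come from the disclosure.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int]   # (chromosome, start, end), end exclusive (assumed)
Mutation = Tuple[str, int]      # (chromosome, position) (assumed)

def in_library(mutation: Mutation, library: List[Region]) -> bool:
    """True if the mutation falls inside any region of the library."""
    chrom, pos = mutation
    return any(c == chrom and start <= pos < end for c, start, end in library)

def covered_fraction(library: List[Region],
                     cohort: Dict[str, List[Mutation]],
                     min_mutations: int = 1) -> float:
    """Fraction of subjects with at least min_mutations inside the library."""
    if not cohort:
        return 0.0
    covered = sum(
        1 for mutations in cohort.values()
        if sum(in_library(m, library) for m in mutations) >= min_mutations
    )
    return covered / len(cohort)

def meets_library_criteria(library: List[Region],
                           cohort: Dict[str, List[Mutation]],
                           min_regions: int = 10,
                           min_fraction: float = 0.60) -> bool:
    """Check both thresholds: region count and subject coverage."""
    return (len(set(library)) >= min_regions
            and covered_fraction(library, cohort) >= min_fraction)

# Made-up example: ten 50-bp regions on chr1 and five subjects.
library = [("chr1", i * 100, i * 100 + 50) for i in range(10)]
cohort = {
    "s1": [("chr1", 10)],
    "s2": [("chr1", 120)],
    "s3": [("chr2", 10)],    # no mutation inside the library
    "s4": [("chr1", 30)],
    "s5": [("chr1", 910)],
}
print(covered_fraction(library, cohort), meets_library_criteria(library, cohort))
```

Raising `min_mutations` to 2 or 3 gives the analogous check for the embodiments that require at least two or at least three mutations per subject.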

It should be understood that the term “library” represents a compilationor collection of individual components. Thus, a library of recurrentlymutated genomic regions is a compilation or collection of recurrentlymutated genomic regions. The libraries of the instant disclosure areuseful because they include a large number of potentially mutatedgenomic regions within a minimal length of genomic sequence. Use ofthese libraries to identify genetic alternations in specific patientsamples is particularly advantageous because the libraries do not needto be optimized on a patient-by-patient basis.

The libraries created according to the instant methods comprise genomicregions that are recurrently mutated in a specific cancer. Theidentification of these recurrent mutations benefits greatly from theavailability of databases such as, for example, The Cancer Genome Atlas(TCGA) and its subsets (http://cancergenome.nih.gov/). Such databasesserve as the starting point for identifying the recurrently mutatedgenomic regions of the instant libraries. The databases also provide asample of mutations occurring within a given percentage of subjects witha specific cancer.

The libraries created according to the instant methods comprise aplurality of genomic regions, wherein the plurality of genomic regionscomprises at least 10 different genomic regions. In some embodiments,the plurality of genomic regions comprises at least 25, at least 50, atleast 100, at least 150, at least 200, at least 500, or even moredifferent genomic regions.

It should be understood that the inclusion of larger numbers of genomicregions generally increases the likelihood that a unique mutation willbe identified to distinguish tumor nucleic acid in a subject from thesubject's genomic nucleic acid. Including too many genomic regions inthe library is not without a cost, however, since the number of genomicregions is directly related to the length of nucleic acids that must besequenced in the analysis. At the extreme, the entire genome of a tumorsample and a genomic sample could be sequenced, and the resultingsequences could be compared to note any differences. Such a brute forceapproach is not possible, however, with the vanishingly small quantitiesof tumor nucleic acid present in a cell-free sample.

The libraries of the instant disclosure address this problem by identifying genomic regions that are recurrently mutated in a particular cancer, and then ranking those regions to maximize the likelihood that the region will include a distinguishing genetic alteration in a particular tumor. The library of recurrently mutated genomic regions, or “selectors”, can be used across an entire population for a given cancer, and does not need to be optimized for each subject.

The term “mutation”, as used herein, refers to a genetic alteration in the genome of an organism, specifically to a change in the nucleotide sequence of the organism. Examples of mutations include point mutations, where a single nucleotide is changed in the genome, and larger-scale changes in the genome, such as rearrangements, insertions, deletions, and amplifications. A recurrent mutation is a mutation that has been identified in more than one individual.

The terms “patient” and “subject” are used interchangeably. These are typically individuals that suffer from the cancer of interest. While the individuals are typically human individuals, the methods and systems of the instant disclosure could also be applied to other species, in particular, to other animal species, for example, livestock animals and pets.

The libraries of recurrently mutated genomic regions disclosed herein are created for a given type of cancer using one or more of the following design phases:

Phase 1: Identify known “driver” genes, i.e., genes that are known to be mutated frequently in the particular cancer.

Phase 2: Maximize patient coverage by selecting genomic regions that contain recurrent mutations in multiple subjects with the particular cancer and ranking those selections to maximize the number of patients identified by mutations in those regions.

Phases 3 and 4: Further ranking of genomic regions containing recurrent mutations by maximizing the “recurrence index”.

Phase 5: Add genomic regions from genes predicted to harbor “driver” mutations in the particular cancer.

Phase 6: Add genomic regions covering fusions and their flanking regions.

It should be understood, however, that the above-described phases of selector design are independent of one another and may be applied separately or in a different order within the methods of library creation and still achieve the desired result.
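The coverage-maximizing selection of Phase 2 can be illustrated with a short sketch. The greedy strategy, the function name `greedy_select`, and the toy patient identifiers below are assumptions made for illustration only; the disclosure does not specify this exact procedure. At each step the sketch picks the region that adds the most patients not yet covered by previously selected regions.

```python
def greedy_select(region_patients, max_regions=None):
    """Rank regions by how many not-yet-covered patients each one adds.

    region_patients: dict mapping a region name to the set of patient
    identifiers that carry a mutation in that region (hypothetical input).
    Returns (selected, covered), where selected is a list of
    (region, patients_gained) tuples in selection order.
    """
    covered = set()
    selected = []
    remaining = dict(region_patients)
    while remaining and (max_regions is None or len(selected) < max_regions):
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=lambda r: len(remaining[r] - covered))
        gain = len(remaining[best] - covered)
        if gain == 0:
            break  # no remaining region adds a new patient
        selected.append((best, gain))
        covered |= remaining.pop(best)
    return selected, covered


# Toy data, not taken from Table 1.
toy = {
    "exon_A": {"p1", "p2", "p3"},
    "exon_B": {"p3", "p4"},
    "exon_C": {"p1", "p2"},
}
selected, covered = greedy_select(toy)
print(selected)         # regions in selection order with patients gained
print(sorted(covered))  # all patients covered by the selected regions
```

In this toy case exon_C is never selected, because every patient it would identify is already covered by exon_A. This mirrors the "patients gained" column of Table 1, where some regions add no new patients.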

Application of the above approaches for recurrently mutated genomic regions in non-small cell lung cancer results in the library shown in Table 1. All genomic regions included in the selector, along with their corresponding HUGO gene symbols and genomic coordinates, as well as patient statistics for NSCLC and a variety of other cancers, are shown, organized by selector design phase. The percentage of coverage of NSCLC patients as the Table 1 library was developed is shown in FIG. 1(b). Also shown in the bottom panel of this figure is the cumulative length of genomic regions (in kb) as the library is created according to the above phasing. The three curves in the top panel show percentage coverage of patients with at least one distinguishing mutation between tumor and genomic sequences (≧1 SNVs), at least two distinguishing mutations between tumor and genomic sequences (≧2 SNVs), and at least three distinguishing mutations between tumor and genomic sequences (≧3 SNVs). As is apparent from these graphs, the library created according to the instant methods identifies genomic regions that are highly likely to include identifiable mutations in tumor sequences. This library includes a relatively small total number of genomic regions and thus a relatively short cumulative length of genomic regions and yet provides a high overall coverage of likely mutations in a population. The library does not, therefore, need to be optimized on a patient-by-patient basis. The relatively short cumulative length of genomic regions also means that the analysis of cancer-derived cell-free DNA using these libraries is highly sensitive and allows the sequencing of this DNA to a great depth.

Accordingly, the libraries of recurrently mutated genomic regions created using the instant methods comprise a plurality of genomic regions that are recurrently mutated in a specific cancer, and the plurality of genomic regions comprises at least 10 different genomic regions. In some embodiments, the plurality of genomic regions comprises at least 25 different genomic regions. In some embodiments, the plurality of genomic regions comprises at least 50 different genomic regions. In some embodiments, the plurality of genomic regions comprises at least 100 different genomic regions. In some embodiments, the plurality of genomic regions comprises at least 150 different genomic regions. In some embodiments, the plurality of genomic regions comprises at least 200 different genomic regions. In some embodiments, the plurality of genomic regions comprises at least 500 different genomic regions or even more.

In some embodiments, the plurality of genomic regions comprises at most 5000 different genomic regions. In some embodiments, the plurality of genomic regions comprises at most 2000 different genomic regions. In some embodiments, the plurality of genomic regions comprises at most 1000 different genomic regions. In some embodiments, the plurality of genomic regions comprises at most 500 different genomic regions. In some embodiments, the plurality of genomic regions comprises at most 200 different genomic regions. In some embodiments, the plurality of genomic regions comprises at most 150 different genomic regions. In some embodiments, the plurality of genomic regions comprises at most 100 different genomic regions. In some embodiments, the plurality of genomic regions comprises at most 50 different genomic regions or even fewer.

Importantly, the libraries of recurrently mutated genomic regions created according to the instant methods enable the identification of patient- and tumor-specific mutations within the genomic regions in a high percentage of subjects. Specifically, in these libraries, at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer. In some embodiments, at least two mutations within the plurality of genomic regions are present in at least 60% of all subjects with the specific cancer. In specific embodiments, at least three mutations, or even more, within the plurality of genomic regions are present in at least 60% of all subjects with the specific cancer.

In some embodiments, in the libraries of recurrently mutated genomic regions created according to these methods, at least one mutation within the plurality of genomic regions is present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, 99.9% or even higher percentages of all subjects with the specific cancer.

In specific embodiments, at least two mutations within the plurality of genomic regions are present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, 99.9% or even higher percentages of all subjects with the specific cancer.

In more specific embodiments, at least three mutations, or even more, within the plurality of genomic regions are present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, 99.9% or even higher percentages of all subjects with the specific cancer.
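The coverage thresholds above can be checked for a candidate library with a simple calculation. The following is a minimal sketch under stated assumptions: the function name `fraction_with_at_least` and the per-patient mutation counts are hypothetical and are not drawn from TCGA or Table 1.

```python
def fraction_with_at_least(mutations_per_patient, k):
    """Fraction of patients with at least k mutations inside the library.

    mutations_per_patient: dict mapping a patient identifier to the number
    of that patient's tumor mutations falling within the library regions.
    """
    if not mutations_per_patient:
        raise ValueError("no patients supplied")
    hits = sum(1 for count in mutations_per_patient.values() if count >= k)
    return hits / len(mutations_per_patient)


# Hypothetical cohort of five patients.
cohort = {"p1": 0, "p2": 1, "p3": 2, "p4": 3, "p5": 5}
for k in (1, 2, 3):
    share = fraction_with_at_least(cohort, k)
    meets = share >= 0.60  # the 60% threshold recited above
    print(f">= {k} mutation(s): {share:.0%} of patients, meets 60%: {meets}")
```

Evaluating the same function for k = 1, 2, and 3 corresponds to the three curves (≧1, ≧2, and ≧3 SNVs) described for the top panel of FIG. 1(b).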

As previously noted, the cumulative length of genomic regions in the libraries of recurrently mutated genomic regions created according to the instant methods is relatively short, thus minimizing sequencing costs associated with the analytical methods relying on these libraries and maximizing their sensitivity. In some embodiments, the cumulative length of genomic regions is at most 30 megabases (Mb). In some embodiments, the cumulative length of genomic regions is at most 20 Mb, 10 Mb, 5 Mb, 2 Mb, or 1 Mb. In some embodiments, the cumulative length of genomic regions is at most 500 kilobases (kb), 200 kb, 100 kb, 50 kb, 20 kb, 10 kb, or even less.
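As an illustration of how cumulative length might be tallied against such a limit, the sketch below sums region lengths using inclusive start and end coordinates. The inclusive convention is an assumption; it reproduces the lengths listed in Table 1 for the two KRAS regions used here (180 bp and 112 bp). The function names are hypothetical, and overlapping regions are not merged in this sketch.

```python
def cumulative_length_bp(regions):
    """Sum of region lengths in bp, treating start and end as inclusive.

    regions: iterable of (chromosome, start, end) tuples.
    Overlapping regions are counted separately (no merging).
    """
    return sum(end - start + 1 for _chrom, start, end in regions)


def within_length_limit(regions, max_bp):
    """True if the library's cumulative length does not exceed max_bp."""
    return cumulative_length_bp(regions) <= max_bp


# The two KRAS regions listed in Table 1.
kras_regions = [
    ("chr12", 25380167, 25380346),
    ("chr12", 25398207, 25398318),
]
print(cumulative_length_bp(kras_regions))         # total length in bp
print(within_length_limit(kras_regions, 10_000))  # compared with a 10 kb limit
```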

In some embodiments, the library of recurrently mutated genomic regions created according to the instant methods comprises the genomic regions displayed in Table 1, or a subset of those genomic regions.

The instant methods include the step of identifying a plurality of genomic regions from a group of genomic regions that are recurrently mutated in a specific cancer. As noted elsewhere, the libraries are particularly useful in methods for analyzing cancer-specific gene alterations in solid tumors, because those alterations can be detected in cell-free nucleic acids present in blood samples. Accordingly, the libraries created according to these methods include genomic regions that are recurrently mutated in a solid tumor. In some embodiments, the solid tumor is a carcinoma. In specific embodiments, the carcinoma is an adenocarcinoma, a non-small cell lung cancer, or a squamous cell carcinoma. The methods are also applicable to genomic regions that are recurrently mutated in other cancers, however. Specifically, the other cancer may be, for example, a sarcoma, a leukemia, a lymphoma, or a myeloma.

Systems

The methods for creating a library of recurrently mutated genomic regions, as disclosed herein, are typically implemented by a programmed computer system. Therefore, according to another aspect, the instant disclosure provides computer systems for creating a library of recurrently mutated genomic regions. Such systems comprise at least one processor and a non-transitory computer-readable medium storing computer-executable instructions that, when executed by the at least one processor, cause the computer system to carry out the above-described methods for creating a library.

Methods for Analyzing Genetic Alterations

The libraries created according to the above-described methods are useful in the analysis of genetic alterations, particularly in comparing tumor and genomic sequences in a patient with cancer. As shown in FIG. 2, a tissue biopsy sample from the patient may be used to discover mutations in the tumor by sequencing the genomic regions of the selector library in tumor and genomic nucleic acid samples and comparing the results. Because the selector libraries are designed to identify mutations in tumors from a large percentage of all patients, it is not necessary to optimize the library for each patient.

Accordingly, in this aspect of the invention, methods are provided for analyzing a cancer-specific genetic alteration in a subject comprising the steps of:

obtaining a tumor nucleic acid sample and a genomic nucleic acid sample from a subject with a specific cancer;

sequencing a plurality of target regions in the tumor nucleic acid sample and in the genomic nucleic acid sample to obtain a plurality of tumor nucleic acid sequences and a plurality of genomic nucleic acid sequences; and

comparing the plurality of tumor nucleic acid sequences to the plurality of genomic nucleic acid sequences to identify a patient-specific genetic alteration in the tumor nucleic acid sample.
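The comparing step can be pictured as a set difference restricted to the target regions. The sketch below is illustrative only: it assumes that variants have already been called from each sample and are represented as (chromosome, position, reference allele, alternate allele) tuples. The function names and all variant calls shown are hypothetical; only the KRAS region coordinates come from Table 1.

```python
def in_target_regions(variant, regions):
    """True if the variant position lies inside any target region.

    Region start and end coordinates are treated as inclusive (an assumption).
    """
    chrom, pos, _ref, _alt = variant
    return any(c == chrom and start <= pos <= end for c, start, end in regions)


def tumor_specific_variants(tumor_calls, genomic_calls, regions):
    """Variants seen in the tumor sample but not in the matched genomic
    sample, limited to the target regions."""
    genomic = set(genomic_calls)
    return sorted(
        v for v in set(tumor_calls)
        if v not in genomic and in_target_regions(v, regions)
    )


# One target region (a KRAS region from Table 1) and hypothetical calls.
regions = [("chr12", 25398207, 25398318)]
tumor_calls = [
    ("chr12", 25398284, "C", "A"),  # tumor only, inside the region
    ("chr12", 25398300, "G", "T"),  # also present in the genomic sample
    ("chr1", 100, "A", "G"),        # outside the target regions
]
genomic_calls = [("chr12", 25398300, "G", "T")]
print(tumor_specific_variants(tumor_calls, genomic_calls, regions))
```

Variants shared with the genomic sample are discarded as inherited, and variants outside the target regions are ignored because those positions are not sequenced in the method as described.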

In these methods, the plurality of target regions are selected from a plurality of genomic regions that are recurrently mutated in the specific cancer; the plurality of genomic regions comprises at least 10 different genomic regions; and at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer. More specifically, the plurality of target regions may correspond to the plurality of genomic regions found in the libraries of recurrently mutated genomic regions created using the above-described methods. In other words, in various embodiments, the number of different genomic regions in the plurality of genomic regions, the number of mutations within the plurality of genomic regions that are present in a specific percentage of all subjects with the specific cancer, the percentage of all subjects with the specific cancer with at least one mutation within the plurality of genomic regions, the specific composition of the plurality of genomic regions, the types of cancer, and the cumulative length of the plurality of genomic regions have the values disclosed above for the methods of creating a library.

In some embodiments, the plurality of target regions used in the methods for analyzing a cancer-specific genetic alteration in a subject corresponds to the library of recurrently mutated genomic regions displayed in Table 1, or a subset of those genomic regions.

It should be understood that the step of obtaining a tumor nucleic acid sample and a genomic nucleic acid sample from a subject with a specific cancer may occur in a single step or in separate steps. For example, it may be possible to obtain a single tissue sample from a patient, for example from a biopsy sample, that includes both tumor nucleic acids and genomic nucleic acids. It is also within the scope of this step to obtain the tumor nucleic acid sample and the genomic nucleic acid sample from the subject in separate samples, in separate tissues, or even at separate times.

The step of obtaining a tumor nucleic acid sample and a genomic nucleic acid sample from a subject with a specific cancer may also include the process of extracting a biological fluid or tissue sample from the subject with the specific cancer. These particular steps are well understood by those of ordinary skill in the medical arts, particularly by those working in the medical laboratory arts.

The step of obtaining a tumor nucleic acid sample and a genomic nucleic acid sample from a subject with a specific cancer may additionally include procedures to improve the yield or recovery of the nucleic acids in the sample. For example, the step may include laboratory procedures to separate the nucleic acids from other cellular components and contaminants that may be present in the biological fluid or tissue sample. As noted, such steps may improve the yield and/or may facilitate the sequencing reactions.

It should also be understood that the step of obtaining a tumor nucleic acid sample and a genomic nucleic acid sample from a subject with a specific cancer may be performed by a commercial laboratory that does not even have direct contact with the subject. For example, the commercial laboratory may obtain the nucleic acid samples from a hospital or other clinical facility where, for example, a biopsy or other procedure is performed to obtain tissue from a subject. The commercial laboratory may thus carry out all the steps of the instantly-disclosed methods at the request of, or under the instructions of, the facility where the subject is being treated or diagnosed.

Methods for Screening

The methods of the instant invention may also be applied to the detection of cancer in a patient, where there is no prior knowledge of the presence of a tumor in the patient. Accordingly, in this aspect of the invention are provided methods for screening a cancer-specific genetic alteration in a subject comprising the steps of:

obtaining a cell-free nucleic acid sample from a subject;

sequencing a plurality of target regions in the cell-free sample to obtain a plurality of cell-free nucleic acid sequences; and

identifying a cancer-specific genetic alteration in the cell-free sample.

In these methods, the plurality of target regions are selected from a plurality of genomic regions that are recurrently mutated in the specific cancer. In some embodiments, the plurality of genomic regions comprises at least 10 different genomic regions, and at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer. More specifically, the plurality of target regions may correspond to the plurality of genomic regions found in the libraries of recurrently mutated genomic regions created using the above-described methods. In other words, in various embodiments, the number of different genomic regions in the plurality of genomic regions, the number of mutations within the plurality of genomic regions that are present in a specific percentage of all subjects with the specific cancer, the percentage of all subjects with the specific cancer with at least one mutation within the plurality of genomic regions, the specific composition of the plurality of genomic regions, the types of cancer, and the cumulative length of the plurality of genomic regions have the values disclosed above for the methods of creating a library.

In some embodiments, the plurality of target regions used in the methods for screening a cancer-specific genetic alteration in a subject corresponds to the library of recurrently mutated genomic regions displayed in Table 1, or a subset of those genomic regions.

It will be readily apparent to one of ordinary skill in the relevant arts that other suitable modifications and adaptations to the methods and applications described herein may be made without departing from the scope of the invention or any embodiment thereof. Having now described the present invention in detail, the same will be more clearly understood by reference to the following Examples, which are included herewith for purposes of illustration only and are not intended to be limiting of the invention.

Examples

Noninvasive and Ultrasensitive Quantitation of Circulating Tumor DNA by Hybrid Capture and Deep Sequencing

To overcome the limitations of prior methods, an ultrasensitive and specific strategy for analysis of cancer-derived cfDNA has been developed: CAncer Personalized Profiling by Deep Sequencing (CAPP-Seq), which can simultaneously detect single nucleotide variants (SNVs), insertions/deletions (indels), and rearrangements without the need for patient-specific optimization. CAPP-Seq employs an adaptable “selector” to enrich recurrently mutated regions in the cancer of interest using a custom library of biotinylated DNA oligonucleotides (Ng et al. (2010) Nat. Genetics 42:30-35). To use CAPP-Seq for monitoring circulating tumor DNA, this selector is typically applied first to matched tumor and normal genomic DNA to identify a patient's cancer-specific genetic aberrations and then directly to cfDNA in order to quantify these mutations (FIG. 1(a) and FIG. 2).

The design of an NSCLC CAPP-Seq selector is shown in FIG. 1(b). Phase 1: Genomic regions harboring known/suspected driver mutations in NSCLC. Phases 2-4: Addition of exons containing recurrent SNVs using WES data from lung adenocarcinomas and squamous cell carcinomas from TCGA (N=407). Regions were selected iteratively to maximize the number of mutations per tumor while minimizing selector size. Recurrence index = total unique patients with mutations covered per kb of exon. Phases 5-6: Exons of predicted NSCLC drivers (Ding et al. (2008) Nature 455:1069-1075; Youn & Simon (2011) Bioinformatics 27:175-181) and introns/exons harboring breakpoints in rearrangements involving ALK, ROS1, and RET were added. Bottom: increase of selector length during each design phase. FIG. 1(c) shows an analysis of the number of SNVs per lung adenocarcinoma covered by the NSCLC CAPP-Seq selector in the TCGA WES cohort (Training; N=229) and an independent lung adenocarcinoma WES data set (Validation; N=183) (Imielinski et al. (2012) Cell 150:1107-1120). Results are compared to selectors randomly sampled from the exome (P<1.0×10⁻⁶ for the difference between random selectors and the NSCLC CAPP-Seq selector). FIG. 1(d) shows the number of SNVs per patient identified by the NSCLC CAPP-Seq selector in WES data from three adenocarcinomas from TCGA: colon (COAD), rectal (READ), and endometrioid (UCEC) cancers. FIGS. 1(e) and 1(f) show quality parameters from a representative CAPP-Seq analysis of plasma cfDNA, including length distribution of sequenced cfDNA fragments (1(e)) and depth of sequencing coverage across all genomic regions in the selector (1(f)). FIG. 1(g) illustrates the variation in sequencing depth across cfDNA samples from 4 patients. The envelope above and below the solid line represents s.e.m. FIG. 2 illustrates the CAPP-Seq computational pipeline. See Detailed Methods section for details.

For the initial implementation of CAPP-Seq we focused on NSCLC, although our approach is generalizable to any cancer for which a comprehensive list of recurrent mutations has been identified. We employed a multi-phase approach to design a NSCLC-specific selector, aiming to identify genomic regions recurrently mutated in this disease (FIG. 1(b), Table 1, and Methods). We began by including exons covering recurrent mutations in potential driver genes from the Catalogue of Somatic Mutations in Cancer (COSMIC) database (Forbes et al. (2010) Nucleic Acids Res. 38:D652-657) as well as other sources (Ding et al. (2008) Nature 455:1069-1075; Youn & Simon (2011) Bioinformatics 27:175-181) (e.g. KRAS, EGFR, TP53). Next, using whole exome sequencing (WES) data from 407 NSCLC patients profiled by The Cancer Genome Atlas (TCGA), an iterative algorithm was applied to maximize the number of mutations per patient while minimizing selector size. The approach relied on a recurrence index that identified known driver mutations as well as uncharacterized genes that are frequently mutated and are therefore likely to be involved in NSCLC pathogenesis (FIG. 3 and Table 1).
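The recurrence index, defined in the description of FIG. 1(b) as the total number of unique patients with mutations covered per kb of exon, can be computed directly. The sketch below is a minimal illustration: the function name is hypothetical, and the two example inputs are the exon lengths and per-exon patient counts listed in Table 1 for a KRAS exon (112 bp, 56 patients) and a TP53 exon (138 bp, 56 patients).

```python
def recurrence_index(patients_with_mutation, exon_length_bp):
    """Unique patients with a mutation in the exon, per kilobase of exon."""
    if exon_length_bp <= 0:
        raise ValueError("exon length must be positive")
    return patients_with_mutation * 1000 / exon_length_bp


# Values taken from Table 1 (patients per exon, exon length in bp).
kras_ri = recurrence_index(56, 112)
tp53_ri = recurrence_index(56, 138)
print(round(kras_ri, 1))  # Table 1 lists 500.0 for this KRAS exon
print(round(tp53_ri, 1))  # Table 1 lists 405.8 for this TP53 exon
```

Because the index normalizes by exon length, a short exon mutated in many patients ranks above a long exon with the same patient count, which is consistent with the goal of maximizing coverage while minimizing selector size.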

TABLE 1 Recurrently mutated genomic regions in NSCLC. Coverage (uniqueLUAD Selector design Genomic region & SCC patients; n = 407) RegionsGenes Length Start End Length Patients Patients No. patients Designphase covered covered (bp) Gene Chr (bp) (bp) (bp) covered gained perexon RI Known drivers 1 1 130 AKT1 chr14 105246424 105246553 130 1 1 17.7 Known drivers 2 2 250 BRAF chr7 140453074 140453192 120 9 8 8 66.7Known drivers 3 2 369 BRAF chr7 140481375 140481493 119 16 7 7 58.8Known drivers 4 3 677 CDKN2A chr9 21970900 21971207 308 46 30 30 97.4Known drivers 5 3 1029 CDKN2A chr9 21974475 21974826 352 53 7 7 19.9Known drivers 6 4 1258 CTNNB1 chr3 41266016 41266244 229 57 4 6 26.2Known drivers 7 5 1382 EGFR chr7 55241613 55241736 124 58 1 3 24.2 Knowndrivers 8 5 1482 EGFR chr7 55242414 55242513 100 65 7 8 80.0 Knowndrivers 9 5 1669 EGFR chr7 55248985 55249171 187 69 4 5 26.7 Knowndrivers 10 5 1826 EGFR chr7 55259411 55259567 157 81 12 14 89.2 Knowndrivers 11 6 1926 ERBB2 chr17 37880164 37880263 100 81 0 0 0.0 Knowndrivers 12 6 2113 ERBB2 chr17 37880978 37881164 187 85 4 4 21.4 Knowndrivers 13 7 2293 HRAS chr11 533765 533944 180 87 2 3 16.7 Known drivers14 7 2405 HRAS chr11 534211 534322 112 90 3 3 26.8 Known drivers 15 82583 KEAP1 chr19 10599867 10600044 178 93 3 3 16.9 Known drivers 16 82790 KEAP1 chr19 10600323 10600529 207 108 15 15 72.5 Known drivers 17 83477 KEAP1 chr19 10602252 10602938 687 128 20 25 36.4 Known drivers 18 84117 KEAP1 chr19 10610070 10610709 640 141 13 18 28.1 Known drivers 19 84285 KEAP1 chr19 10597327 10597494 168 143 2 2 11.9 Known drivers 20 94465 KRAS chr12 25380167 25380346 180 147 4 4 22.2 Known drivers 21 94577 KRAS chr12 25398207 25398318 112 191 44 56 500.0 Known drivers 2210 4789 MEK1 chr15 66727364 66727575 212 191 0 0 0.0 Known drivers 23 114931 MET chr7 116411902 116412043 142 193 2 2 14.1 Known drivers 24 125199 NFE2L2 chr2 178098732 178098998 268 212 19 31 115.7 Known drivers25 13 5417 NOTCH1 chr9 139396723 139396940 218 212 0 1 
4.6 Known drivers26 13 5850 NOTCH1 chr9 139399124 139399556 433 212 0 0 0.0 Known drivers27 13 7339 NOTCH1 chr9 139390522 139392010 1489 214 2 3 2.0 Knowndrivers 28 13 7489 NOTCH1 chr9 139397633 139397782 150 214 0 0 0.0 Knowndrivers 29 14 7669 NRAS chr1 115256420 115256599 180 217 3 5 27.8 Knowndrivers 30 14 7781 NRAS chr1 115258670 115258781 112 217 0 0 0.0 Knowndrivers 31 15 7907 PIK3CA chr3 178935997 178936122 126 225 8 19 150.8Known drivers 32 15 8179 PIK3CA chr3 178951881 178952152 272 228 3 414.7 Known drivers 33 16 8259 PTEN chr10 89624226 89624305 80 229 1 112.5 Known drivers 34 16 8345 PTEN chr10 89653781 89653866 86 229 0 00.0 Known drivers 35 16 8391 PTEN chr10 89685269 89685314 46 231 2 365.2 Known drivers 36 16 8436 PTEN chr10 89690802 89690846 45 231 0 00.0 Known drivers 37 16 8676 PTEN chr10 89692769 89693008 240 234 3 520.8 Known drivers 38 16 8819 PTEN chr10 89711874 89712016 143 235 1 321.0 Known drivers 39 16 8987 PTEN chr10 89717609 89717776 168 238 3 635.7 Known drivers 40 16 9213 PTEN chr10 89720650 89720875 226 239 1 313.3 Known drivers 41 17 9504 STK11 chr19 1206912 1207202 291 240 1 413.7 Known drivers 42 17 9589 STK11 chr19 1218415 1218498 85 241 1 223.5 Known drivers 43 17 9680 STK11 chr19 1219322 1219412 91 242 1 111.0 Known drivers 44 17 9814 STK11 chr19 1220371 1220504 134 242 0 429.9 Known drivers 45 17 9952 STK11 chr19 1220579 1220716 138 242 0 429.0 Known drivers 46 17 10081 STK11 chr19 1221211 1221339 129 242 0 431.0 Known drivers 47 17 10140 STK11 chr19 1221947 1222005 59 242 0 00.0 Known drivers 48 17 10329 STK11 chr19 1222983 1223171 189 242 0 00.0 Known drivers 49 17 10524 STK11 chr19 1226452 1226646 195 242 0 00.0 Known drivers 50 18 10662 TP53 chr17 7577018 7577155 138 264 22 56405.8 Known drivers 51 18 10773 TP53 chr17 7577498 7577608 111 286 22 50450.5 Known drivers 52 18 10887 TP53 chr17 7578176 7578286 114 300 14 39342.1 Known drivers 53 18 11167 TP53 chr17 7579311 7579590 280 312 12 31110.7 Known drivers 54 18 11352 
TP53 chr17 7578370 7578554 185 340 28 68367.6 Max coverage 55 19 11472 REG1B chr2 79313937 79314056 120 341 1 1083.3 Max coverage 56 20 11527 TPTE chr21 10970008 10970062 55 343 2 472.7 Max coverage 57 21 11641 CSMD3 chr8 113246593 113246706 114 345 2 870.2 Max coverage 58 21 11749 TP53 chr17 7573926 7574033 108 348 3 983.3 Max coverage 59 22 11861 FAM135B chr8 139151228 139151339 112 350 28 71.4 Max coverage 60 23 11950 U2AF1 chr21 44524424 44524512 89 351 1 556.2 Max coverage 61 24 12084 THSD7A chr7 11501637 11501770 134 352 1 967.2 Max coverage 62 25 12257 MLL3 chr7 151962122 151962294 173 353 1 1163.6 Max coverage 63 26 12339 EYA4 chr6 133849862 133849943 82 354 1 561.0 Max coverage 64 27 12505 HCN1 chr5 45267190 45267355 166 355 1 954.2 Max coverage 65 28 12590 AKR1B10 chr7 134222945 134223029 85 357 25 58.8 Max coverage 66 29 12692 SLC6A5 chr11 20668379 20668480 102 358 15 49.0 Max coverage 67 30 12801 DPP10 chr2 116525872 116525980 109 360 26 55.0 Max coverage 68 31 12894 SCN7A chr2 167327124 167327216 93 361 14 43.0 Max coverage 69 32 12988 SNTG1 chr8 51621445 51621538 94 362 1 553.2 Max coverage 70 33 13093 VPS13A chr9 79946925 79947029 105 363 1 547.6 Max coverage 71 34 13240 IL1RAPL1 chrX 29938065 29938211 147 364 17 47.6 Max coverage 72 35 13408 CTNNA2 chr2 80085138 80085305 168 365 18 47.6 Max coverage 73 35 13598 CSMD3 chr8 113323206 113323395 190 366 19 47.4 Max coverage 74 36 13705 FAM5C chr1 190203501 190203607 107 367 15 46.7 Max coverage 75 37 13813 CACNA1E chr1 181708282 161708389 108 3681 4 37.0 Max coverage 76 38 14528 KRTAP5-5 chr11 1651070 1651784 715 3713 31 43.4 Max coverage 77 39 14650 PDE1C chr7 31864480 31864601 122 3721 5 41.0 Max coverage 78 40 14772 RYR2 chr1 237808626 237808747 122 3731 5 41.0 Max coverage 79 41 14896 NRXN1 chr2 50733632 50733755 124 374 15 40.3 Max coverage 80 42 15021 COL19A1 chr6 70637800 70637924 125 375 15 40.0 Max coverage 81 42 15349 CSMD3 chr8 113697634 113697961 328 376 113 39.6 Max coverage 82 43 15551 
LRP1B chr2 141665445 141665646 202 3771 7 34.7 Max coverage 83 44 15709 GKN2 chr2 69173435 69173592 158 378 16 38.0 Max coverage 84 45 16031 CD5L chr1 157805624 157805945 322 379 112 37.3 Max coverage 85 46 16250 SPTA1 chr1 158627266 158627484 219 3801 8 36.5 Max coverage 86 47 16392 DHX9 chr1 182812428 182812569 142 3811 5 35.2 Max coverage 87 48 16535 ADAMTS20 chr12 43858393 43858535 143382 1 5 35.0 Max coverage 88 49 16707 NLRP4 chr19 56382192 56382363 172382 0 6 34.9 Max coverage 89 50 17199 CDH18 chr5 19473334 19473825 492384 2 17 34.6 Max coverage 90 51 17344 MYH2 chr17 10450791 10450935 145386 2 5 34.5 RI ≧ 30 91 52 18281 OR5L2 chr11 55594694 55595630 937 386 030 32.0 RI ≧ 30 92 53 19317 OR4A15 chr11 55135359 55136394 1036 386 0 3230.9 RI ≧ 30 93 54 20245 OR6F1 chr1 247875130 247876057 928 386 0 2628.0 RI ≧ 30 94 55 21176 OR4C6 chr11 55432642 55433572 931 387 1 27 29.0RI ≧ 30 95 56 22224 OR2T4 chr1 248524882 248525929 1048 387 0 33 31.5 RI≧ 30 96 56 23342 FAM5C chr1 190067147 190068264 1118 387 0 35 31.3 RI ≧30 97 57 23598 PSG2 chr19 43575851 43576106 256 387 0 9 35.2 RI ≧ 30 9858 23797 ITM2A chrX 78618438 78618636 199 387 0 6 30.2 RI ≧ 30 99 5924062 TNN chr1 175092535 175092799 265 387 0 12 45.3 RI ≧ 30 100 6024206 GATA3 chr10 8105958 8106101 144 387 0 3 20.8 RI ≧ 30 101 60 24369HCN1 chr5 45461947 45462109 183 387 0 5 30.7 RI ≧ 30 102 61 24503 OCA2chr15 28211835 28211968 134 387 0 6 44.8 RI ≧ 30 103 61 24686 CTNNA2chr2 80816428 80816610 183 387 0 5 27.3 RI ≧ 30 104 62 24863 CNTN5 chr1199715818 99715994 177 387 0 5 33.9 RI ≧ 30 105 63 25755 POM121L12 chr753103364 53104255 892 387 0 28 31.4 RI ≧ 30 106 64 25945 LRRC7 chr170225887 70226076 190 387 0 5 26.3 RI ≧ 30 107 65 26165 CNTNAP5 chr2125530375 125530594 220 387 0 8 36.4 RI ≧ 30 108 66 26313 SLC4A10 chr2162751188 162751335 148 387 0 5 33.8 RI ≧ 30 109 67 26412 SETD2 chr347142947 47143045 99 387 0 3 30.3 RI ≧ 30 110 68 26744 GFRAL chr655216050 55216381 332 387 0 10 30.1 RI ≧ 30 111 69 26837 SORCS3 
chr10 106927015 106927107 93 388 1 3 32.3
RI ≧ 30 112 70 27359 POTEG chr14 19553416 19553937 522 388 0 17 32.6
RI ≧ 30 113 71 27489 F9 chrX 138630521 138630650 130 389 1 4 30.8
RI ≧ 30 114 72 27583 SLC26A3 chr7 107416896 107416989 94 389 0 2 21.3
RI ≧ 30 115 73 27753 UNC5D chr8 35806044 35606213 170 389 0 5 29.4
RI ≧ 30 116 74 27860 PDE4DIP chr1 144882775 144882881 107 389 0 4 37.4
RI ≧ 30 117 75 27943 MRPL1 chr4 78870950 78871032 83 389 0 4 48.2
RI ≧ 30 118 76 28013 COL25A1 chr4 109784474 109784543 70 389 0 3 42.9
RI ≧ 30 119 76 28161 SPTA1 chr1 158650372 158650519 148 389 0 5 33.8
RI ≧ 30 120 77 28309 TNR chr1 175331798 175331945 148 369 0 5 33.8
RI ≧ 30 121 78 28491 GALNT13 chr2 155157921 155158102 182 389 0 6 33.0
RI ≧ 30 122 79 28618 EIF3E chr8 109241298 109241424 127 389 0 5 39.4
RI ≧ 30 123 80 28691 SLC5A1 chr22 32445929 32446001 73 389 0 4 54.8
RI ≧ 30 124 81 28757 COASY chr17 40717000 40717065 66 389 0 3 45.5
RI ≧ 30 125 82 28930 TBX15 chr1 119467268 119467440 173 389 0 7 40.5
RI ≧ 30 126 83 29099 PYHIN1 chr1 158908869 158909037 169 389 0 6 35.5
RI ≧ 30 127 84 29164 PSG5 chr19 43690493 43690557 65 389 0 3 46.2
RI ≧ 30 128 85 29262 BTRC chr10 103290993 103291090 98 389 0 2 20.4
RI ≧ 30 129 86 29394 MDGA2 chr14 47324226 47324357 132 389 0 4 30.3
RI ≧ 30 130 87 29454 GUCY1A3 chr4 156629387 156629446 60 389 0 2 33.3
RI ≧ 30 131 88 29570 HGF chr7 81386504 81386619 116 389 0 4 34.5
RI ≧ 30 132 89 29656 TIMD4 chr5 156346467 156346552 86 389 0 3 34.9
RI ≧ 30 133 90 29844 AK5 chr1 77752625 77752812 188 389 0 6 31.9
RI ≧ 30 134 91 30077 ODZ3 chr4 183245173 183245405 233 389 0 7 30.0
RI ≧ 30 135 92 30177 COL5A2 chr2 189927897 189927996 100 389 0 3 30.0
RI ≧ 30 136 93 30299 NTM chr11 132180005 132180126 122 389 0 4 32.8
RI ≧ 30 137 94 30426 LTBP1 chr2 33500031 33500157 127 389 0 5 39.4
RI ≧ 30 138 95 30587 PRSS1 chr7 142458405 142458565 161 389 0 5 31.1
RI ≧ 30 139 95 30794 CDKN2A chr9 21971001 21971207 207 389 0 26 125.6
RI ≧ 30 140 96 30922 CNGB3 chr8 87738758 87738885 128 389 0 4 31.3
RI ≧ 30 141 97 31049 SI chr3 164777689 164777815 127 389 0 4 31.5
RI ≧ 30 142 97 31135 SI chr3 164767578 164767663 86 389 0 4 46.5
RI ≧ 30 143 98 31320 TMEM132D chr12 129822176 129822362 185 389 0 6 32.4
RI ≧ 30 144 99 31429 ASTN1 chr1 176998769 176998877 109 389 0 3 27.5
RI ≧ 30 145 100 31571 SAGE1 chrX 134987410 134987551 142 389 0 6 42.3
RI ≧ 30 146 100 31709 THSD7A chr7 11464322 11464459 138 389 0 5 36.2
RI ≧ 30 147 101 31907 ADAMTS12 chr5 33683963 33684160 198 389 0 6 30.3
RI ≧ 30 148 101 32090 NRXN1 chr2 50463926 50464108 183 389 0 8 43.7
RI ≧ 30 149 101 32294 CSMD3 chr8 113562899 113563102 204 389 0 7 34.3
RI ≧ 30 150 101 32414 CSMD3 chr8 113364644 113364763 120 389 0 5 41.7
RI ≧ 30 151 102 32504 EPB41L4B chr9 112018415 112018504 90 389 0 2 22.2
RI ≧ 30 152 103 32687 POLR3B chr12 106820974 106821136 163 389 0 4 24.5
RI ≧ 30 153 104 32873 ATP10B chr5 160097469 180097674 208 389 0 7 34.0
RI ≧ 30 154 105 33001 CSMD1 chr8 3165216 3165343 128 389 0 4 31.3
RI ≧ 30 155 106 33164 FBN2 chr5 127648325 127648487 163 389 0 5 30.7
RI ≧ 30 156 107 33252 EXOC5 chr14 57684699 57684786 88 389 0 2 22.7
RI ≧ 30 157 108 33315 ANKRD30A chr10 37440987 37441049 63 389 0 3 47.6
RI ≧ 30 158 109 33414 TRIML1 chr4 189065189 189065287 99 389 0 4 40.4
RI ≧ 30 159 109 33538 SPTA1 chr1 158631076 158631199 124 389 0 4 32.3
RI ≧ 30 160 110 33699 POLDIP2 chr17 26684313 26684473 161 389 0 5 31.1
RI ≧ 30 161 111 33863 KLHL1 chr13 70314525 70314688 164 389 0 5 30.5
RI ≧ 20 162 112 34454 TRIM58 chr1 248039201 248039791 591 389 0 14 23.7
RI ≧ 20 163 113 34563 GRIA3 chrX 122537262 122537370 109 389 0 3 27.5
RI ≧ 20 164 114 34777 CNOT4 chr7 135048605 135048818 214 389 0 5 23.4
RI ≧ 20 165 115 34947 NAV3 chr12 78582388 78582557 170 389 0 4 23.5
RI ≧ 20 166 115 35975 NAV3 chr12 78400198 78401225 1028 389 0 22 21.4
RI ≧ 20 167 116 36354 TRPC5 chrX 111195270 111195648 379 389 0 8 21.1
RI ≧ 20 168 117 36480 LRRC2 chr3 46592956 46593081 126 389 0 3 23.8
RI ≧ 20 169 118 36726 ADAMTS16 chr5 5239793 5240038 246 389 0 6 24.4
RI ≧ 20 170 119 36869 ACER2 chr9 19424697 19424839 143 389 0 3 21.0
RI ≧ 20 171 120 37103 AMOT chrX 112024113 112024346 234 389 0 5 21.4
RI ≧ 20 172 121 37215 OBP2A chr9 138439716 138439827 112 389 0 3 26.8
Predicted drivers 173 122 38109 INHBA chr7 41729247 41730140 894 389 0 17 19.0
Predicted drivers 174 122 38498 INHBA chr7 41739584 41739972 389 389 0 3 7.7
Predicted drivers 175 123 38605 EPHA5 chr4 66189831 66189937 107 389 0 3 28.0
Predicted drivers 176 123 38762 EPHA5 chr4 66197690 66197846 157 389 0 2 12.7
Predicted drivers 177 123 38957 EPHA5 chr4 66201649 66201843 195 389 0 2 10.3
Predicted drivers 178 123 39108 EPHA5 chr4 66213771 66213921 151 389 0 3 19.9
Predicted drivers 179 123 39319 EPHA5 chr4 66217106 66217316 211 389 0 4 19.0
Predicted drivers 180 123 39420 EPHA5 chr4 66218740 66218840 101 389 0 2 19.8
Predicted drivers 181 123 39607 EPHA5 chr4 66230734 66230920 187 389 0 3 16.0
Predicted drivers 182 123 39734 EPHA5 chr4 66231649 66231775 127 389 0 3 23.6
Predicted drivers 183 123 39835 EPHA5 chr4 66233058 66233158 101 389 0 2 19.8
Predicted drivers 184 123 39936 EPHA5 chr4 66242698 66242798 101 389 0 0 0.0
Predicted drivers 185 123 40040 EPHA5 chr4 66270091 66270194 104 389 0 2 19.2
Predicted drivers 186 123 40201 EPHA5 chr4 66280001 66280161 161 389 0 1 6.2
Predicted drivers 187 123 40327 EPHA5 chr4 66286158 66286283 126 389 0 0 0.0
Predicted drivers 188 123 40664 EPHA5 chr4 66356094 66356430 337 389 0 5 14.8
Predicted drivers 189 123 40821 EPHA5 chr4 66361105 66361261 157 389 0 1 6.4
Predicted drivers 190 123 41486 EPHA5 chr4 66467358 86468022 665 389 0 6 9.0
Predicted drivers 191 123 41588 EPHA5 chr4 66509062 66509163 102 389 0 0 0.0
Predicted drivers 192 123 41770 EPHA5 chr4 66535279 66535460 182 389 0 1 5.5
Predicted drivers 193 124 41871 EPHA3 chr3 89156892 89156992 101 389 0 0 0.0
Predicted drivers 194 124 41973 EPHA3 chr3 89176340 89176441 102 389 0 2 19.6
Predicted drivers 195 124 42635 EPHA3 chr3 89259009 89259670 662 389 0 6 9.1
Predicted drivers 196 124 42792 EPHA3 chr3 89390065 89390221 157 389 0 4 25.5
Predicted drivers 197 124 43129 EPHA3 chr3 89390904 89391240 337 389 0 3 8.9
Predicted drivers 198 124 43255 EPHA3 chr3 89444986 89445111 126 389 0 2 15.9
Predicted drivers 199 124 43445 EPHA3 chr3 89448467 89448656 190 389 0 1 5.3
Predicted drivers 200 124 43549 EPHA3 chr3 89456418 89456521 104 389 0 0 0.0
Predicted drivers 201 124 43651 EPHA3 chr3 89457198 89457299 102 389 0 0 0.0
Predicted drivers 202 124 43778 EPHA3 chr3 89462290 89462416 127 389 0 3 23.6
Predicted drivers 203 124 43965 EPHA3 chr3 89468354 89468540 187 389 0 1 5.3
Predicted drivers 204 124 44066 EPHA3 chr3 89478236 89478336 101 389 0 0 0.0
Predicted drivers 205 124 44277 EPHA3 chr3 89480299 89480509 211 389 0 4 19.0
Predicted drivers 206 124 44428 EPHA3 chr3 89498374 89498524 151 389 0 1 6.6
Predicted drivers 207 124 44623 EPHA3 chr3 89499326 89499520 185 389 0 2 10.3
Predicted drivers 208 124 44780 EPHA3 chr3 89521613 89521769 157 389 0 3 19.1
Predicted drivers 209 124 44887 EPHA3 chr3 89528546 89528652 107 389 0 1 9.3
Predicted drivers 210 125 44989 PTPRD chr9 8317857 8317958 102 389 0 2 19.6
Predicted drivers 211 125 45126 PTPRD chr9 8319830 8319966 137 389 0 0 0.0
Predicted drivers 212 125 45282 PTPRD chr9 8331581 8331736 156 389 0 1 6.4
Predicted drivers 213 125 45409 PTPRD chr9 8338921 8339047 127 389 0 2 15.7
Predicted drivers 214 125 45537 PTPRD chr9 8340342 8340469 128 389 0 1 7.8
Predicted drivers 215 125 45717 PTPRD chr9 8341089 8341268 180 389 0 0 0.0
Predicted drivers 216 125 46004 PTPRD chr9 8341692 8341978 287 389 0 2 7.0
Predicted drivers 217 125 46160 PTPRD chr9 8375935 8376090 156 389 0 1 6.4
Predicted drivers 218 125 46281 PTPRD chr9 8376606 8376726 121 389 0 1 8.3
Predicted drivers 219 125 46458 PTPRD chr9 8389231 8389407 177 389 0 0 0.0
Predicted drivers 220 125 46583 PTPRD chr9 8404536 8404660 125 389 0 0 0.0
Predicted drivers 221 125 46684 PTPRD chr9 8436590 8436690 101 389 0 1 9.9
Predicted drivers 222 125 46785 PTPRD chr9 8437168 8437268 101 389 0 0 0.0
Predicted drivers 223 125 46899 PTPRD chr9 8449724 8449837 114 389 0 3 26.3
Predicted drivers 224 125 47001 PTPRD chr9 8454536 8454637 102 389 0 0 0.0
Predicted drivers 225 125 47163 PTPRD chr9 8460410 8460571 162 389 0 5 18.5
Predicted drivers 226 125 47374 PTPRD chr9 8465465 8465675 211 389 0 6 28.4
Predicted drivers 227 125 47476 PTPRD chr9 8470989 8471090 102 389 0 1 9.8
Predicted drivers 228 125 47737 PTPRD chr9 8484118 8484378 261 389 0 5 19.2
Predicted drivers 229 125 47839 PTPRD chr9 8485226 8485327 102 389 0 0 0.0
Predicted drivers 230 125 48428 PTPRD chr9 8485761 8436349 589 389 0 4 6.8
Predicted drivers 231 125 48547 PTPRD chr9 8492861 8492979 119 389 0 1 8.4
Predicted drivers 232 125 48649 PTPRD chr9 8497204 8497305 102 389 0 1 9.8
Predicted drivers 233 125 48844 PTPRD chr9 8499646 8499840 195 389 0 2 10.3
Predicted drivers 234 125 49151 PTPRD chr9 8500753 8501059 307 389 0 3 9.8
Predicted drivers 235 125 49297 PTPRD chr9 8504260 8504405 146 389 0 1 6.8
Predicted drivers 236 125 49432 PTPRD chr9 8507300 8507434 135 389 0 1 7.4
Predicted drivers 237 125 50015 PTPRD chr9 8517847 8518429 583 389 0 9 15.4
Predicted drivers 238 125 50286 PTPRD chr9 8521276 8521546 271 389 0 5 18.5
Predicted drivers 239 125 50387 PTPRD chr9 8523468 8523568 101 389 0 1 9.9
Predicted drivers 240 125 50499 PTPRD chr9 8524924 8525035 112 389 0 1 8.9
Predicted drivers 241 125 50600 PTPRD chr9 8526585 8526685 101 389 0 0 0.0
Predicted drivers 242 125 50702 PTPRD chr9 8527298 8527399 102 389 0 2 19.6
Predicted drivers 243 125 50892 PTPRD chr9 8528590 8528779 190 389 0 4 21.1
Predicted drivers 244 125 51035 PTPRD chr9 8633316 8633458 143 389 0 2 13.6
Predicted drivers 245 125 51182 PTPRD chr9 8636698 8636644 147 389 0 2 13.6
Predicted drivers 246 125 51283 PTPRD chr9 8733761 8733861 101 389 0 0 0.0
Predicted drivers 247 126 51507 KDR chr4 55946107 55946330 224 389 0 1 4.5
Predicted drivers 248 126 51608 KDR chr4 55948115 55948215 101 389 0 0 0.0
Predicted drivers 249 126 51709 KDR chr4 55948702 55948802 101 389 0 2 19.8
Predicted drivers 250 126 51862 KDR chr4 55953773 55953925 153 389 0 3 19.6
Predicted drivers 251 126 51969 KDR chr4 55955034 55955140 107 389 0 2 18.7
Predicted drivers 252 126 52070 KDR chr4 55955540 55955640 101 389 0 0 0.0
Predicted drivers 253 126 52183 KDR chr4 55955857 55955969 113 389 0 1 8.8
Predicted drivers 254 126 52307 KDR chr4 55956122 55956245 124 389 0 0 0.0
Predicted drivers 255 126 52408 KDR chr4 55958782 55958882 101 389 0 2 19.8
Predicted drivers 256 128 52563 KDR chr4 55960968 55961122 155 389 0 2 12.9
Predicted drivers 257 126 52665 KDR chr4 55961737 55961838 102 389 0 2 19.6
Predicted drivers 258 126 52780 KDR chr4 55962395 55962509 115 389 0 1 8.7
Predicted drivers 259 126 52886 KDR chr4 55963828 55963933 106 389 0 3 28.3
Predicted drivers 260 126 53023 KDR chr4 55964303 55964439 137 389 0 0 0.0
Predicted drivers 261 126 53131 KDR chr4 55964863 55964970 108 389 0 2 18.5
Predicted drivers 262 126 53264 KDR chr4 55968063 55968195 133 389 0 1 7.5
Predicted drivers 263 126 53412 KDR chr4 55968528 55968675 148 389 0 2 13.5
Predicted drivers 264 126 53755 KDR chr4 55970809 55971151 343 389 0 5 14.6
Predicted drivers 265 126 53865 KDR chr4 55971998 55972107 110 389 0 2 18.2
Predicted drivers 266 126 53990 KDR chr4 55972853 55972977 125 389 0 1 8.0
Predicted drivers 267 126 54148 KDR chr4 55973903 55974060 158 389 0 2 12.7
Predicted drivers 268 126 54313 KDR chr4 55976569 55976733 165 389 0 2 12.1
Predicted drivers 269 126 54429 KDR chr4 55976820 55976935 116 389 0 1 8.6
Predicted drivers 270 126 54608 KDR chr4 55979470 55979648 179 389 0 2 11.2
Predicted drivers 271 128 54749 KDR chr4 55980292 55980432 141 389 0 0 0.0
Predicted drivers 272 126 54919 KDR chr4 55981040 55981209 170 389 0 1 5.9
Predicted drivers 273 126 55051 KDR chr4 55981447 55981578 132 389 0 4 30.3
Predicted drivers 274 126 55249 KDR chr4 55984770 55984967 198 389 0 0 0.0
Predicted drivers 275 126 55350 KDR chr4 55987260 55987360 101 389 0 1 9.9
Predicted drivers 276 126 55452 KDR chr4 55991376 55991477 102 389 0 0 0.0
Predicted drivers 277 127 55639 NTRK3 chr15 88420165 88420351 187 389 0 0 0.0
Predicted drivers 278 127 55799 NTRK3 chr15 88423500 88423659 160 389 0 1 6.3
Predicted drivers 279 127 55900 NTRK3 chr15 88428895 88428995 101 389 0 0 0.0
Predicted drivers 280 127 56145 NTRK3 chr15 88472421 88472665 245 389 0 1 4.1
Predicted drivers 281 127 56319 NTRK3 chr15 88476242 88476415 174 389 0 4 23.0
Predicted drivers 282 127 56451 NTRK3 chr15 88483853 88483984 132 389 0 1 7.6
Predicted drivers 283 127 56571 NTRK3 chr15 88522575 88522694 120 389 0 0 0.0
Predicted drivers 284 127 56707 NTRK3 chr15 88524456 88524591 136 389 0 0 0.0
Predicted drivers 285 127 56897 NTRK3 chr15 88576087 88576276 190 389 0 2 10.5
Predicted drivers 286 127 57001 NTRK3 chr15 88669501 88669604 104 389 0 3 28.8
Predicted drivers 287 127 57103 NTRK3 chr15 88670374 88670475 102 389 0 0 0.0
Predicted drivers 288 127 57204 NTRK3 chr15 88671903 88672003 101 389 0 0 0.0
Predicted drivers 289 127 57502 NTRK3 chr15 88678331 88878628 298 389 0 7 23.5
Predicted drivers 290 127 57645 NTRK3 chr15 88679129 88679271 143 389 0 1 7.0
Predicted drivers 291 127 57789 NTRK3 chr15 88679697 88679840 144 389 0 2 13.9
Predicted drivers 292 127 57948 NTRK3 chr15 88680634 88680792 159 389 0 0 0.0
Predicted drivers 293 127 58050 NTRK3 chr15 88690549 88690650 102 389 0 0 0.0
Predicted drivers 294 127 58151 NTRK3 chr15 88726634 88726734 101 389 0 1 9.9
Predicted drivers 295 127 58253 NTRK3 chr15 88727442 88727543 102 389 0 1 9.8
Predicted drivers 296 126 58391 RB1 chr13 48878048 48878185 138 389 0 0 0.0
Predicted drivers 297 128 56519 RB1 chr13 48881415 48881542 128 389 0 3 23.4
Predicted drivers 298 128 58636 RB1 chr13 48916734 48916850 117 389 0 1 8.5
Predicted drivers 299 128 58757 RB1 chr13 48919215 48919335 121 389 0 1 8.3
Predicted drivers 300 128 58859 RB1 chr13 48921929 48922030 102 389 0 0 0.0
Predicted drivers 301 128 58960 RB1 chr13 48923075 48923175 101 389 0 0 0.0
Predicted drivers 302 128 59072 RB1 chr13 48934152 48934283 112 389 0 2 17.9
Predicted drivers 303 128 59216 RB1 chr13 48936950 48937093 144 389 0 0 0.0
Predicted drivers 304 128 59317 RB1 chr13 48939018 48939118 101 389 0 0 0.0
Predicted drivers 305 128 59428 RB1 chr13 48941629 48941739 111 389 0 3 27.0
Predicted drivers 306 128 59529 RB1 chr13 48942651 48942751 101 389 0 0 0.0
Predicted drivers 307 128 59630 RB1 chr13 48947534 48947634 101 389 0 2 19.8
Predicted drivers 308 128 59748 RB1 chr13 48951053 48951170 118 389 0 0 0.0
Predicted drivers 309 128 59850 RB1 chr13 48953707 48953808 102 389 0 2 19.6
Predicted drivers 310 128 59951 RB1 chr13 48954154 48954254 101 389 0 0 0.0
Predicted drivers 311 128 60053 RB1 chr13 48954288 48954389 102 389 0 1 9.8
Predicted drivers 312 128 60251 RB1 chr13 48955382 48955579 198 389 0 0 0.0
Predicted drivers 313 128 60371 RB1 chr13 49027128 49027247 120 389 0 0 0.0
Predicted drivers 314 128 60518 RB1 chr13 49030339 49030485 147 389 0 3 20.4
Predicted drivers 315 128 60665 RB1 chr13 49033823 49033969 147 389 0 1 6.8
Predicted drivers 316 128 60771 RB1 chr13 49037866 49037971 106 389 0 0 0.0
Predicted drivers 317 128 60886 RB1 chr13 49039133 49039247 115 389 0 1 8.7
Predicted drivers 318 128 61051 RB1 chr13 49039340 49039504 165 389 0 2 12.1
Predicted drivers 319 128 61153 RB1 chr13 49047460 49047561 102 389 0 0 0.0
Predicted drivers 320 128 61297 RB1 chr13 49050836 49050979 144 389 0 0 0.0
Predicted drivers 321 128 61398 RB1 chr13 49051465 49051565 101 389 0 0 0.0
Predicted drivers 322 128 61499 RB1 chr13 49054120 49054220 101 389 0 0 0.0
Predicted drivers 323 129 61946 ERBB4 chr2 212248339 212248785 447 389 0 3 6.7
Predicted drivers 324 129 62245 ERBB4 chr2 212251577 212251875 299 389 0 3 10.0
Predicted drivers 325 129 62346 ERBB4 chr2 212252643 212252743 101 389 0 0 0.0
Predicted drivers 326 129 62518 ERBB4 chr2 212285165 212285336 172 389 0 2 11.6
Predicted drivers 327 129 62619 ERBB4 chr2 212286730 212286830 101 389 0 1 9.9
Predicted drivers 328 129 62787 ERBB4 chr2 212288879 212289026 148 389 0 1 6.8
Predicted drivers 329 129 62868 ERBB4 chr2 212293120 212293220 101 389 0 0 0.0
Predicted drivers 330 129 63025 ERBB4 chr2 212295669 212295825 157 389 0 2 12.7
Predicted drivers 331 129 63212 ERBB4 chr2 212426627 212426813 187 389 0 1 5.3
Predicted drivers 332 129 63312 ERBB4 chr2 212483901 212484000 100 389 0 0 0.0
Predicted drivers 333 129 63436 ERBB4 chr2 212488646 212488769 124 389 0 0 0.0
Predicted drivers 334 129 63570 ERBB4 chr2 212495186 212495319 134 389 0 0 0.0
Predicted drivers 335 129 63672 ERBB4 chr2 212522465 212522566 102 389 0 2 19.6
Predicted drivers 336 129 63828 ERBB4 chr2 212530047 212530202 156 389 0 1 6.4
Predicted drivers 337 129 63929 ERBB4 chr2 212537885 212537985 101 389 0 1 9.9
Predicted drivers 338 129 64063 ERBB4 chr2 212543776 212543909 134 389 0 1 7.5
Predicted drivers 339 129 64264 ERBB4 chr2 212566691 212566891 201 389 0 2 10.0
Predicted drivers 340 129 64366 ERBB4 chr2 212568823 212568924 102 389 0 0 0.0
Predicted drivers 341 129 64467 ERBB4 chr2 212570029 212570129 101 389 0 1 9.8
Predicted drivers 342 129 64595 ERBB4 chr2 212576774 212576901 128 389 0 1 7.8
Predicted drivers 343 129 64710 ERBB4 chr2 212578259 212578373 115 389 0 1 8.7
Predicted drivers 344 129 64853 ERBB4 chr2 212587117 212587259 143 389 0 0 0.0
Predicted drivers 345 129 64973 ERBB4 chr2 212589800 212589919 120 389 0 2 16.7
Predicted drivers 348 129 65074 ERBB4 chr2 212615346 212615446 101 389 0 0 0.0
Predicted drivers 347 129 65210 ERBB4 chr2 212652749 212652884 136 389 0 1 7.4
Predicted drivers 348 129 65398 ERBB4 chr2 212812154 212812341 188 390 1 4 21.3
Predicted drivers 349 129 65551 ERBB4 chr2 212989476 212989628 153 390 0 2 13.1
Predicted drivers 350 129 65652 ERBB4 chr2 213403163 213403263 101 390 0 0 0.0
Predicted drivers 351 130 65754 NTRK1 chr1 156785575 156785676 102 390 0 0 0.0
Predicted drivers 352 130 65868 NTRK1 chr1 156811872 156811985 114 390 0 0 0.0
Predicted drivers 353 130 66061 NTRK1 chr1 156830726 156830938 213 390 0 0 0.0
Predicted drivers 354 130 66183 NTRK1 chr1 156834132 156834233 102 390 0 1 9.8
Predicted drivers 355 130 66284 NTRK1 chr1 156834505 156834605 101 390 0 0 0.0
Predicted drivers 356 130 66386 NTRK1 chr1 156836685 156836786 102 390 0 0 0.0
Predicted drivers 357 130 66533 NTRK1 chr1 156837895 156838041 147 390 0 1 6.8
Predicted drivers 358 130 66677 NTRK1 chr1 156838296 156838439 144 390 0 0 0.0
Predicted drivers 359 130 66811 NTRK1 chr1 156841414 156841547 134 390 0 0 0.0
Predicted drivers 360 130 67139 NTRK1 chr1 156843424 156843751 328 390 0 1 3.0
Predicted drivers 361 130 67240 NTRK1 chr1 156844133 156844233 101 390 0 0 0.0
Predicted drivers 362 130 67341 NTRK1 chr1 156844340 156844440 101 390 0 0 0.0
Predicted drivers 363 130 67445 NTRK1 chr1 156844697 156844800 104 390 0 0 0.0
Predicted drivers 364 130 67593 NTRK1 chr1 156845311 156845458 148 390 0 2 13.5
Predicted drivers 365 130 67725 NTRK1 chr1 156845871 156846002 132 390 0 3 22.7
Predicted drivers 366 130 67899 NTRK1 chr1 156846191 156846364 174 390 0 2 11.5
Predicted drivers 367 130 68141 NTRK1 chr1 156848913 156849154 242 390 0 4 16.5
Predicted drivers 368 130 68301 NTRK1 chr1 156849790 156849949 160 390 0 0 0.0
Predicted drivers 369 130 68488 NTRK1 chr1 156851248 156851434 187 390 0 0 0.0
Predicted drivers 370 131 68589 NF1 chr17 29422307 29422407 101 390 0 0 0.0
Predicted drivers 371 131 68734 NF1 chr17 29483000 29483144 145 390 0 0 0.0
Predicted drivers 372 131 68835 NF1 chr17 29486019 29486119 101 390 0 1 9.9
Predicted drivers 373 131 69027 NF1 chr17 29490203 29490394 192 390 0 1 5.2
Predicted drivers 374 131 89135 NF1 chr17 29496908 29497015 108 390 0 1 9.3
Predicted drivers 375 131 69236 NF1 chr17 29508423 29508523 101 390 0 0 0.0
Predicted drivers 376 131 69337 NF1 chr17 29508715 29508815 101 390 0 0 0.0
Predicted drivers 377 131 69496 NF1 chr17 29509525 29509683 159 390 0 1 6.3
Predicted drivers 378 131 69671 NF1 chr17 29527439 29527613 175 390 0 3 17.1
Predicted drivers 379 131 69795 NF1 chr17 29528054 29528177 124 390 0 0 0.0
Predicted drivers 380 131 69897 NF1 chr17 29528415 29528516 102 390 0 0 0.0
Predicted drivers 381 131 70030 NF1 chr17 29533257 29533389 133 390 0 0 0.0
Predicted drivers 382 131 70166 NF1 chr17 29541468 29541603 136 390 0 1 7.4
Predicted drivers 383 131 70281 NF1 chr17 29546022 29546136 115 390 0 1 8.7
Predicted drivers 384 131 70423 NF1 chr17 29548867 29549008 142 390 0 1 7.0
Predicted drivers 385 131 70548 NF1 chr17 29550461 29550585 125 390 0 0 0.0
Predicted drivers 386 131 70705 NF1 chr17 29552112 29552268 157 390 0 0 0.0
Predicted drivers 387 131 70956 NF1 chr17 29553452 29553702 251 390 0 1 4.0
Predicted drivers 386 131 71057 NF1 chr17 29554222 29554322 101 390 0 0 0.0
Predicted drivers 389 131 71158 NF1 chr17 29554532 29554632 101 390 0 1 9.9
Predicted drivers 390 131 71600 NF1 chr17 29556042 29556483 442 390 0 2 4.5
Predicted drivers 391 131 71741 NF1 chr17 29556852 29556992 141 390 0 1 7.1
Predicted drivers 392 131 71865 NF1 chr17 29557277 29557400 124 390 0 1 8.1
Predicted drivers 393 131 71966 NF1 chr17 29557851 29557951 101 390 0 0 0.0
Predicted drivers 394 131 72084 NF1 chr17 29559090 29559207 118 390 0 0 0.0
Predicted drivers 395 131 72267 NF1 chr17 29559717 29559899 183 390 0 2 10.9
Predicted drivers 396 131 72480 NF1 chr17 29560019 29560231 213 390 0 1 4.7
Predicted drivers 397 131 72643 NF1 chr17 29562628 29562790 163 390 0 2 12.3
Predicted drivers 398 131 72748 NF1 chr17 29562935 29563039 105 390 0 0 0.0
Predicted drivers 399 131 72885 NF1 chr17 29576001 29576137 137 390 0 0 0.0
Predicted drivers 400 131 72987 NF1 chr17 29579936 29580037 102 390 0 0 0.0
Predicted drivers 401 131 73147 NF1 chr17 29585361 29585520 160 390 0 0 0.0
Predicted drivers 402 131 73248 NF1 chr17 29588048 29586148 101 390 0 1 9.9
Predicted drivers 403 131 73396 NF1 chr17 29587386 29587533 148 390 0 2 13.5
Predicted drivers 404 131 73544 NF1 chr17 29588728 29588875 148 390 0 0 0.0
Predicted drivers 405 131 73656 NF1 chr17 29592246 29592357 112 390 0 0 0.0
Predicted drivers 406 131 74090 NF1 chr17 29652837 29653270 434 390 0 2 4.6
Predicted drivers 407 131 74432 NF1 chr17 29654516 29654857 342 390 0 3 8.8
Predicted drivers 408 131 74636 NF1 chr17 29657313 29657516 204 390 0 2 9.8
Predicted drivers 409 131 74831 NF1 chr17 29661855 29662049 195 390 0 3 15.4
Predicted drivers 410 131 74973 NF1 chr17 29663350 29683491 142 390 0 2 14.1
Predicted drivers 411 131 75254 NF1 chr17 29663652 29663932 281 390 0 0 0.0
Predicted drivers 412 131 75470 NF1 chr17 29664385 29664600 216 390 0 1 4.6
Predicted drivers 413 131 75571 NF1 chr17 29664817 29664917 101 390 0 1 9.9
Predicted drivers 414 131 75687 NF1 chr17 29665042 29665157 116 390 0 0 0.0
Predicted drivers 415 131 75790 NF1 chr17 29665721 29665823 103 390 0 2 19.4
Predicted drivers 416 131 75932 NF1 chr17 29667522 29667663 142 390 0 1 7.0
Predicted drivers 417 131 76060 NF1 chr17 29670026 29670153 128 390 0 2 15.6
Predicted drivers 418 131 76193 NF1 chr17 29676137 29676269 133 390 0 2 15.0
Predicted drivers 419 131 76330 NF1 chr17 29677200 29677336 137 390 0 0 0.0
Predicted drivers 420 131 76489 NF1 chr17 29679274 29679432 159 390 0 2 12.6
Predicted drivers 421 131 76613 NF1 chr17 29683477 29683600 124 390 0 0 0.0
Predicted drivers 422 131 76745 NF1 chr17 29683977 29684108 132 390 0 1 7.6
Predicted drivers 423 131 76847 NF1 chr17 29684286 29684387 102 390 0 1 9.8
Predicted drivers 424 131 76991 NF1 chr17 29685497 29685640 144 390 0 1 6.9
Predicted drivers 425 131 77093 NF1 chr17 29685959 29686060 102 390 0 0 0.0
Predicted drivers 426 131 77311 NF1 chr17 29687504 29687721 216 390 0 0 0.0
Predicted drivers 427 131 77455 NF1 chr17 29701030 29701173 144 390 0 1 6.9
Predicted drivers 428 132 77621 APC chr5 112043414 112043579 166 390 0 0 0.0
Predicted drivers 429 132 77757 APC chr5 112090587 112090722 136 390 0 0 0.0
Predicted drivers 430 132 77859 APC chr5 112102014 112102115 102 390 0 1 9.8
Predicted drivers 431 132 78062 APC chr5 112102885 112103087 203 390 0 2 9.9
Predicted drivers 432 132 78172 APC chr5 112111325 112111434 110 390 0 1 9.1
Predicted drivers 433 132 78287 APC chr5 112116486 112116600 115 390 0 0 0.0
Predicted drivers 434 132 78388 APC chr5 112128134 112128234 101 390 0 0 0.0
Predicted drivers 435 132 78494 APC chr5 112136975 112137080 106 390 0 0 0.0
Predicted drivers 436 132 78594 APC chr5 112151191 112151290 100 390 0 0 0.0
Predicted drivers 437 132 78974 APC chr5 112154662 112155041 380 390 0 1 2.6
Predicted drivers 438 132 79075 APC chr5 112157590 112157690 101 390 0 0 0.0
Predicted drivers 439 132 79216 APC chr5 112162804 112162944 141 390 0 0 0.0
Predicted drivers 440 132 79317 APC chr5 112163614 112163714 101 390 0 0 0.0
Predicted drivers 441 132 79435 APC chr5 112164552 112164669 118 390 0 2 16.9
Predicted drivers 442 132 79651 APC chr5 112170647 112170862 216 390 0 0 0.0
Predicted drivers 443 132 86226 APC chr5 112173249 112179823 6575 391 1 23 3.5
Predicted drivers 444 133 86327 ATM chr11 108098337 108096437 101 391 0 0 0.0
Predicted drivers 445 133 86441 ATM chr11 108098502 108098615 114 391 0 1 8.8
Predicted drivers 446 133 86588 ATM chr11 108099904 108100050 147 391 0 0 0.0
Predicted drivers 447 133 86754 ATM chr11 108106396 108106561 168 391 0 0 0.0
Predicted drivers 448 133 86921 ATM chr11 108114679 108114845 167 391 0 0 0.0
Predicted drivers 449 133 87161 ATM chr11 108115514 108115753 240 391 0 1 4.2
Predicted drivers 450 133 87326 ATM chr11 108117690 108117854 165 391 0 0 0.0
Predicted drivers 451 133 87497 ATM chr11 108119659 108119829 171 391 0 1 5.8
Predicted drivers 452 133 87870 ATM chr11 108121427 108121799 373 391 0 0 0.0
Predicted drivers 453 133 88066 ATM chr11 108122563 108122758 196 391 0 0 0.0
Predicted drivers 454 133 88187 ATM chr11 108123541 108123641 101 391 0 1 9.9
Predicted drivers 455 133 88394 ATM chr11 108124540 108124766 227 391 0 0 0.0
Predicted drivers 456 133 88521 ATM chr11 108126941 108127067 127 391 0 1 7.9
Predicted drivers 457 133 88648 ATM chr11 108128207 108128333 127 391 0 0 0.0
Predicted drivers 458 133 88749 ATM chr11 108129707 108129807 101 391 0 0 0.0
Predicted drivers 459 133 88922 ATM chr11 108137897 108138069 173 391 0 1 5.8
Predicted drivers 460 133 89123 ATM chr11 108139136 108139336 201 391 0 0 0.0
Predicted drivers 461 133 89225 ATM chr11 108141781 108141882 102 391 0 0 0.0
Predicted drivers 462 133 89382 ATM chr11 108141977 108142133 157 391 0 0 0.0
Predicted drivers 463 133 89483 ATM chr11 108143246 108143346 101 391 0 0 0.0
Predicted drivers 464 133 89615 ATM chr11 108143448 108143579 132 391 0 1 7.6
Predicted drivers 465 133 89734 ATM chr11 108150217 108150335 119 391 0 0 0.0
Predicted drivers 466 133 89909 ATM chr11 108151721 108151895 175 391 0 0 0.0
Predicted drivers 467 133 90080 ATM chr11 108153436 108153606 171 391 0 2 11.7
Predicted drivers 468 133 90328 ATM chr11 108154953 108155200 248 391 0 1 4.0
Predicted drivers 469 133 90445 ATM chr11 108158326 108158442 117 391 0 0 0.0
Predicted drivers 470 133 90573 ATM chr11 108159703 108159830 128 391 0 1 7.8
Predicted drivers 471 133 90774 ATM chr11 108160328 108160528 201 391 0 1 5.0
Predicted drivers 472 133 90950 ATM chr11 108163345 108163520 176 391 0 0 0.0
Predicted drivers 473 133 91116 ATM chr11 108164039 108164204 166 391 0 0 0.0
Predicted drivers 474 133 91250 ATM chr11 108165653 108165786 134 391 0 0 0.0
Predicted drivers 475 133 91351 ATM chr11 108168011 108168111 101 391 0 1 9.9
Predicted drivers 476 133 91524 ATM chr11 108170440 108170612 173 391 0 1 5.8
Predicted drivers 477 133 91667 ATM chr11 108172374 108172516 143 391 0 0 0.0
Predicted drivers 478 133 91845 ATM chr11 108173579 108173756 178 391 0 0 0.0
Predicted drivers 479 133 92024 ATM chr11 108175401 108175579 179 391 0 2 11.2
Predicted drivers 480 133 92125 ATM chr11 108178617 108178717 101 391 0 0 0.0
Predicted drivers 481 133 92282 ATM chr11 108180886 108181042 157 391 0 0 0.0
Predicted drivers 482 133 92383 ATM chr11 108183131 108183231 101 391 0 1 9.9
Predicted drivers 483 133 92485 ATM chr11 108186543 108186644 102 391 0 0 0.0
Predicted drivers 484 133 92589 ATM chr11 108186737 108186840 104 391 0 1 9.6
Predicted drivers 485 133 92739 ATM chr11 108188099 108188248 150 391 0 0 0.0
Predicted drivers 486 133 92845 ATM chr11 108190680 108190785 106 391 0 0 0.0
Predicted drivers 487 133 92966 ATM chr11 108192027 108192147 121 391 0 0 0.0
Predicted drivers 488 133 93202 ATM chr11 108196036 108196271 236 391 0 1 4.2
Predicted drivers 489 133 93371 ATM chr11 108196784 108196952 169 391 0 0 0.0
Predicted drivers 490 133 93486 ATM chr11 108198371 108198485 115 391 0 0 0.0
Predicted drivers 491 133 93705 ATM chr11 108199747 108199965 218 391 0 1 4.6
Predicted drivers 492 133 93914 ATM chr11 108200940 108201148 209 391 0 0 0.0
Predicted drivers 493 133 94029 ATM chr11 108202170 108202284 115 391 0 0 0.0
Predicted drivers 494 133 94189 ATM chr11 108202605 108202764 160 391 0 0 0.0
Predicted drivers 495 133 94329 ATM chr11 106203488 108203627 140 391 0 0 0.0
Predicted drivers 496 133 94431 ATM chr11 108204603 108204704 102 391 0 1 9.8
Predicted drivers 497 133 94573 ATM chr11 108205695 108205836 142 391 0 3 21.1
Predicted drivers 498 133 94691 ATM chr11 108206571 108206688 118 391 0 1 8.5
Predicted drivers 499 133 94842 ATM chr11 108213948 108214098 151 391 0 0 0.0
Predicted drivers 500 133 95009 ATM chr11 108216469 108216635 167 391 0 0 0.0
Predicted drivers 501 133 95111 ATM chr11 108217998 108218099 102 391 0 1 9.8
Predicted drivers 502 133 95227 ATM chr11 108224492 108224607 116 391 0 1 8.6
Predicted drivers 503 133 95328 ATM chr11 108225519 108225619 101 391 0 0 0.0
Predicted drivers 504 133 95466 ATM chr11 108235808 108235945 138 391 0 1 7.2
Predicted drivers 505 133 95651 ATM chr11 108236051 108236235 185 391 0 2 10.8
Predicted drivers 506 134 95753 FGFR4 chr5 176516598 176516699 102 391 0 0 0.0
Predicted drivers 507 134 960718 FGFR4 chr5 176517390 176517654 265 391 0 1 3.8
Predicted drivers 508 134 96120 FGFR4 chr5 176517735 176517836 102 391 0 1 9.8
Predicted drivers 509 134 96288 FGFR4 chr5 176517938 176518105 168 391 0 0 0.0
Predicted drivers 510 134 96413 FGFR4 chr5 176518685 176518809 125 391 0 0 0.0
Predicted drivers 511 134 96605 FGFR4 chr5 176519321 176519512 192
391 0 0 0.0
Predicted drivers 512 134 96745 FGFR4 chr5 176519646 176519785 140 391 0 0 0.0
Predicted drivers 513 134 97160 FGFR4 chr5 176520138 176520552 415 391 0 2 4.8
Predicted drivers 514 134 97283 FGFR4 chr5 176520654 176520776 123 391 0 0 0.0
Predicted drivers 515 134 97395 FGFR4 chr5 176522330 176522441 112 391 0 1 8.9
Predicted drivers 516 134 97587 FGFR4 chr5 176522533 176522724 192 391 0 0 0.0
Predicted drivers 517 134 97711 FGFR4 chr5 176523057 176523180 124 391 0 0 0.0
Predicted drivers 518 134 97813 FGFR4 chr5 176523272 176523373 102 391 0 0 0.0
Predicted drivers 519 134 97952 FGFR4 chr5 176523604 176523742 139 391 0 0 0.0
Predicted drivers 520 134 98059 FGFR4 chr5 176524292 176524398 107 391 0 0 0.0
Predicted drivers 521 134 98210 FGFR4 chr5 176524527 176524677 151 391 0 0 0.0
Add fusions 522 135 100435 ALK chr2 29446207 29448431 2225 — — — —
Add fusions 523 136 117908 ROS1 chr6 117641031 117658503 17473 — — — —
Add fusions 524 137 123433 RET chr10 43606655 43612179 5525 — — — —
Add fusions 525 138 123876 PDGFRA chr4 55140698 55141140 443 — — — —
Add fusions 526 139 125384 FGFR1 chr8 38275746 38277253 1508 — — — —

Coverage (unique LUAD & SCC patients; n = 407)
Coverage (all LUAD & SCC samples; n = 419)
Columns: Design phase; No. patients w/1 SNV; % patients ≧1 SNV; % patients ≧2 SNVs; % patients ≧3 SNVs; Samples covered; Samples gained; No. samples per exon; RI; No. samples w/1 SNV; % samples ≧1 SNV; % samples ≧2 SNVs; % samples ≧3 SNVs

Known drivers 1 0.25 0.00 0.00 1 1 1 7.7 1 0.24 0.00 0.00
Known drivers 9 2.21 0.00 0.00 11 10 10 83.3 11 2.63 0.00 0.00
Known drivers 16 3.93 0.00 0.00 18 7 7 58.8 18 4.30 0.00 0.00
Known drivers 46 11.30 0.00 0.00 48 30 30 97.4 48 11.46 0.00 0.00
Known drivers 53 13.02 0.00 0.00 55 7 7 19.9 55 13.13 0.00 0.00
Known drivers 55 14.00 0.49 0.00 59 4 6 26.2 57 14.08 0.48 0.00
Known drivers 54 14.25 0.98 0.00 60 1 3 24.2 56 14.32 0.95 0.00
Known drivers 60 15.97 1.23 0.00 67 7 8 80.0 62 15.99 1.19 0.00
Known drivers 64 16.95 1.23 0.25 71 4 5 26.7 66 16.95 1.19 0.24
Known drivers 74 19.90 1.72 0.25 84 13 15 95.5 77 20.05 1.67 0.24
Known drivers 74 19.90 1.72 0.25 84 0 0 0.0 77 20.05 1.67 0.24
Known drivers 78 20.88 1.72 0.25 88 4 4 21.4 81 21.00 1.67 0.24
Known drivers 79 21.38 1.87 0.25 90 2 3 16.7 82 21.48 1.91 0.24
Known drivers 82 22.11 1.97 0.25 93 3 3 26.8 85 22.20 1.91 0.24
Known drivers 85 22.85 1.97 0.25 96 3 3 16.3 88 22.91 1.91 0.24
Known drivers 100 26.54 1.97 0.25 111 15 15 72.5 103 26.49 1.91 0.24
Known drivers 117 31.45 2.70 0.74 131 20 25 36.4 120 31.26 2.63 0.72
Known drivers 126 34.64 3.69 0.98 145 14 19 29.7 130 34.81 3.58 0.95
Known drivers 128 35.14 3.69 0.98 147 2 2 11.9 132 35.08 3.58 0.95
Known drivers 132 36.12 3.69 0.98 151 4 4 22.2 136 36.04 3.58 0.95
Known drivers 164 46.93 6.63 0.98 196 45 57 508.9 169 46.78 6.44 0.95
Known drivers 164 46.93 6.63 0.98 196 0 0 0.0 169 46.78 6.44 0.95
Known drivers 166 47.42 6.63 0.98 198 2 2 14.1 171 47.26 6.44 0.95
Known drivers 174 52.09 9.34 0.98 217 19 31 115.7 179 51.79 9.07 0.95
Known drivers 173 52.09 9.58 0.98 217 0 1 4.6 178 51.79 9.31 0.95
Known drivers 173 52.09 9.58 0.98 217 0 0 0.0 178 51.79 9.31 0.95
Known drivers 174 52.58 9.83 0.98 219 2 3 2.0 179 52.27 9.55 0.95
Known drivers 174 52.58 9.83 0.98 219 0 0 0.0 179 52.27 9.55 0.95
Known drivers 175 53.32 10.32 0.98 222 3 5 27.8 180 52.98 10.02 0.95
Known drivers 175 53.32 10.32 0.98 222 0 0 0.0 180 52.98 10.02 0.95
Known drivers 174 55.28 12.53 1.47 230 8 19 150.8 179 54.89 12.17 1.43
Known drivers 176 56.02 12.78 1.47 233 3 4 14.7 181 55.61 12.41 1.43
Known drivers 177 56.27 12.78 1.47 234 1 1 12.5 182 55.85 12.41 1.43
Known drivers 177 56.27 12.78 1.47 234 0 0 0.0 182 55.85 12.41 1.43
Known drivers 178 56.76 13.02 1.47 236 2 3 65.2 183 56.32 12.65 1.43
Known drivers 178 56.76 13.02 1.47 236 0 0 0.0 183 56.32 12.65 1.43
Known drivers 179 57.49 13.51 1.47 239 3 5 20.8 184 57.04 13.13 1.43
Known drivers 179 57.74 13.76 1.72 240 1 3 21.0 184 57.28 13.37 1.67
Known drivers 179 58.48 14.50 1.72 243 3 6 35.7 184 58.00 14.08 1.67
Known drivers 179 58.72 14.74 1.97 244 1 3 13.3 184 58.23 14.32 1.91
Known drivers 179 58.97 14.99 2.46 245 1 4 13.7 184 58.47 14.56 2.39
Known drivers 179 59.21 15.23 2.46 246 1 2 23.5 184 58.71 14.80 2.39
Known drivers 180 59.46 15.23 2.46 247 1 1 11.0 185 58.95 14.80 2.39
Known drivers 177 59.46 15.97 2.70 247 0 4 29.9 182 58.95 15.51 2.63
Known drivers 174 59.46 16.71 2.95 247 0 4 29.0 179 58.95 16.23 2.86
Known drivers 171 59.46 17.44 3.19 247 0 4 31.0 176 58.95 16.95 3.10
Known drivers 171 59.46 17.44 3.19 247 0 0 0.0 178 58.95 16.95 3.10
Known drivers 171 59.46 17.44 3.19 247 0 0 0.0 176 58.95 16.95 3.10
Known drivers 171 59.46 17.44 3.19 247 0 0 0.0 176 58.95 16.95 3.10
Known drivers 168 64.86 23.59 5.16 269 22 58 420.3 171 64.20 23.39 5.01
Known drivers 167 70.27 29.24 6.14 292 23 51 459.5 171 69.69 28.88 5.97
Known drivers 164 73.71 33.42 8.11 306 14 39 342.1 168 73.03 32.94 7.88
Known drivers 164 76.66 36.36 9.58 319 13 32 114.3 169 76.13 35.80 9.31
Known drivers 167 83.54 42.51 12.04 347 28 69 373.0 171 82.62 42.00 11.69
Max coverage 163 83.78 43.73 12.78 349 2 11 91.7 168 83.29 43.20 12.41
Max coverage 165 84.28 43.73 13.02 352 3 5 90.9 171 84.01 43.20 12.65
Max coverage 164 84.77 44.47 13.76 354 2 10 87.7 169 84.49 44.15 13.60
Max coverage 164 85.50 45.21 14.50 357 3 9 83.3 169 85.20 44.87 14.32
Max coverage 162 86.00 46.19 14.99 360 3 9 80.4 168 85.92 45.82 14.80
Max coverage 163 86.24 46.19 15.72 362 2 6 67.4 170 86.40 45.82 15.51
Max coverage 161 86.49 46.93 16.46 363 1 9 67.2 168 86.63 46.54 16.23
Max coverage 160 86.73 47.42 17.69 364 1 11 63.6 167 86.37 47.02 17.42
Max coverage 161 86.98 47.42 18.43 365 1 5 61.0 168 87.11 47.02 18.14
Max coverage 161 87.22 47.67 19.16 366 1 10 60.2 168 87.35 47.26 18.85
Max coverage 163 87.71 47.67 19.66 368 2 5 58.8 170 87.83 47.26 19.33
Max coverage 163 87.96 47.91 20.15 369 1 6 58.8 170 88.07 47.49 20.05
Max coverage 164 88.45 48.16 20.39 371 2 6 55.0 171 88.54 47.73 20.29
Max coverage 164 88.70 48.40 20.64 372 1 5 53.8 170 88.78 48.21 20.53
Max coverage 163 88.94 48.89 20.64 373 1 5 53.2 169 89.02 48.69 20.53
Max coverage 162 89.19 49.39 20.88 374 1 5 47.6 168 89.26 49.16 20.76
Max coverage 161 89.43 49.88 21.87 375 1 7 47.6 167 89.50 49.64 21.72
Max coverage 161 89.68 50.12 22.85 376 1 8 47.6 167 89.74 49.88 22.67
Max coverage 160 89.93 50.61 23.83 377 1 9 47.4 166 89.98 50.36 23.63
Max coverage 159 90.17 51.11 24.32 378 1 5 46.7 165 90.21 50.84 24.11
Max coverage 158 90.42 51.60 24.57 379 1 5 46.3 163 90.45 51.55 24.34
Max coverage 152 91.15 53.81 26.78 382 3 32 44.8 157 91.17 53.70 26.73
Max coverage 153 91.40 53.81 27.03 383 1 5 41.0 158 91.41 53.70 28.97
Max coverage 153 91.65 54.05 27.03 384 1 5 41.0 158 91.85 53.94 26.97
Max coverage 152 91.89 54.55 27.52 385 1 5 40.3 157 91.89 54.42 27.45
Max coverage 152 92.14 54.79 28.01 386 1 5 40.0 157 92.12 54.65 27.92
Max coverage 151 92.38 55.28 28.99 387 1 13 39.6 156 92.36 55.13 28.88
Max coverage 150 92.63 55.77 29.48 388 1 8 39.6 155 92.60 55.61 29.59
Max coverage 149 92.87 56.27 29.98 389 1 6 38.0 154 92.84 56.09 30.07
Max coverage 147 93.12 57.00 30.96 390 1 12 37.3 152 93.08 56.80 31.03
Max coverage 144 93.37 57.99 30.96 391 1 8 36.5 149 93.32 57.76 31.03
Max coverage 143 93.61 58.48 31.20 392 1 5 35.2 148 93.56 58.23 31.26
Max coverage 144 93.86 58.48 31.20 393 1 5 35.0 149 93.79 58.23 31.26
Max coverage 143 93.86 58.72 31.94 394 1 6 34.9 150 94.03 58.23 31.98
Max coverage 140 94.35 59.95 32.68 396 2 17 34.6 147 94.51 59.43 32.70
Max coverage 142 94.84 59.95 32.92 398 2 5 34.5 149 94.99 59.43 32.94
RI ≧ 30 134 94.84 61.92 35.63 398 0 30 32.0 141 94.99 61.34 35.56
RI ≧ 30 126 94.84 63.88 37.59 398 0 34 32.8 133 94.99 63.25 37.71
RI ≧ 30 121 94.84 65.11 38.33 398 0 28 30.2 127 94.99 64.68 38.42
RI ≧ 30 117 95.09 66.34 39.80 399 1 28 30.1 123 95.23 65.87 39.86
RI ≧ 30 113 95.09 67.32 42.01 399 0 33 31.5 119 95.23 66.83 42.00
RI ≧ 30 109 95.09 68.30 43.24 399 0 36 32.2 115 95.23 67.78 43.20
RI ≧ 30 105 95.09 69.29 43.24 399 0 9 35.2 111 95.23 68.74 43.20
RI ≧ 30 102 95.09 70.02 43.49 399 0 6 30.2 108 95.23 69.45 43.44
RI ≧ 30 99 95.09 70.76 43.73 399 0 12 45.3 105 95.23 70.17 43.68
RI ≧ 30 97 95.09 71.25 43.73 399 0 5 34.7 102 95.23 70.88 43.68
RI ≧ 30 94 95.09 71.99 44.23 399 0 5 30.7 99 95.23 71.80 44.15
RI ≧ 30 91 95.09 72.73 44.23 399 0 7 52.2 96 95.23 72.32 44.15
RI ≧ 30 88 95.09 73.46 44.23 399 0 6 32.8 93 95.23 73.03 44.15
RI ≧ 30 85 95.09 74.20 44.23 399 0 6 33.9 90 95.23 73.75 44.15
RI ≧ 30 82 95.09 74.94 45.21 399 0 29 32.5 87 95.23 74.46 45.11
RI ≧ 30 80 95.09 75.43 45.45 399 0 6 31.6 84 95.23 75.18 45.35
RI ≧ 30 77 95.09 76.17 45.70 399 0 8 36.4 81 95.23 75.89 45.58
RI ≧ 30 75 95.09 76.66 45.70 399 0 5 33.8 79 95.23 76.37 45.58
RI ≧ 30 73 95.09 77.15 45.95 399 0 3 30.3 77 95.23 76.85 45.82
RI ≧ 30 71 95.09 77.64 45.95 399 0 11 33.1 75 95.23 77.33 45.82
RI ≧ 30 70 95.33 78.13 45.95 400 1 3 32.3 74 95.47 77.80 45.82
RI ≧ 30 68 95.33 78.62 47.17 400 0 17 32.6 72 95.47 78.28 47.02
RI ≧ 30 67 95.58 79.12 47.17 401 1 4 30.8 71 95.70 78.76 47.02
RI ≧ 30 67 95.58 79.12 47.42 401 0 3 31.9 69 95.70 79.24 47.02
RI ≧ 30 65 95.58 79.61 47.42 401 0 6 35.3 67 95.70 79.71 47.02
RI ≧ 30 63 95.58 80.10 47.42 401 0 4 37.4 65 95.70 80.19 47.02
RI ≧ 30 61 95.58 80.59 47.42 401 0 4 48.2 63 95.70 80.67 47.02
RI ≧ 30 59 95.58 81.08 47.42 401 0 3 42.9 61 95.70 81.15 47.02
RI ≧ 30 57 95.58 81.57 47.42 401 0 5 33.8 59 95.70 81.62 47.02
RI ≧ 30 56 95.58 81.82 47.42 401 0 7 47.3 57 95.70 82.10 47.26
RI ≧ 30 54 95.58 82.31 47.42 401 0 6 33.0 55 95.70 82.58 47.26
RI ≧ 30 52 95.58 82.80 47.67 401 0 5 39.4 53 95.70 83.05 47.49
RI ≧ 30 51 95.58 83.05 47.67 401 0 4 54.8 52 95.70 83.29 47.49
RI ≧ 30 51 95.58 83.05 48.16 401 0 3 45.5 51 95.70 83.53 47.73
RI ≧ 30 50 95.58 83.29 48.65 401 0 7 40.5 50 95.70 83.77 48.21
RI ≧ 30 49 95.58 83.54 48.89 401 0 6 35.5 49 95.70 84.01 48.45
RI ≧ 30 48 95.58 83.78 48.89 401 0 3 46.2 46 95.70 84.25 48.45
RI ≧ 30 47 95.58 84.03 48.89 401 0 3 30.6 47 95.70 84.49 48.45
RI ≧ 30 46 95.58 84.28 48.89 401 0 4 30.3 46 95.70 84.73 48.45
RI ≧ 30 45 95.58 84.52 48.89 401 0 3 50.0 45 95.70 84.96 48.45
RI ≧ 30 44 95.58 84.77 49.14 401 0 4 34.5 44 95.70 85.20 48.69
RI ≧ 30 43 95.58 85.01 49.14 401 0 3 34.9 43 95.70 85.44 48.69
RI ≧ 30 42 95.58 85.26 49.63 401 0 6 31.9 42 95.70 85.68 49.16
RI ≧ 30 41 95.58 85.50 50.61 401 0 7 30.0 41 95.70 85.92 50.12
RI ≧ 30 40 95.58 85.75 50.86 401 0 3 30.0 40 95.70 86.16 50.36
RI ≧ 30 39 95.58 86.00 50.86 401 0 4 32.8 39 95.70 86.40 50.36
RI ≧ 30 38 95.58 86.24 51.11 401 0 5 39.4 38 95.70 86.63 50.60
RI ≧ 30 37 95.58 86.49 51.35 401 0 5 31.1 37 95.70 86.87 50.84
RI ≧ 30 36 95.58 86.73 51.60 401 0 26 125.6 36 95.70 87.11 51.07
RI ≧ 30 35 95.58 86.98 51.60 401 0 4 31.3 35 95.70 87.35 51.07
RI ≧ 30 34 95.58 87.22 51.84 401 0 4 31.5 34 95.70 87.59 51.31
RI ≧ 30 33 95.58 87.47 52.09 401 0 4 46.5 33 95.70 87.83 51.55
RI ≧ 30 32 95.58 87.71 52.09 401 0 6 32.4 32 95.70 88.07 51.55
RI ≧ 30 31 95.58 87.96 52.09 401 0 4 36.7 31 95.70 88.31 51.55
RI ≧ 30 30 95.58 88.21 52.33 401 0 6 42.3 30 95.70 88.54 51.79
RI ≧ 30 29 95.58 88.45 52.33 401 0 5 36.2 29 95.70 88.76 51.79
RI ≧ 30 28 95.58 88.70 52.58 401 0 6 30.3 28 95.70 89.02 52.03
RI ≧ 30 27 95.58 88.94 52.83 401 0 8 43.7 27 95.70 89.26 52.27
RI ≧ 30 26 95.58 89.19 52.83 401 0 7 34.3 26 95.70 89.50 52.27
RI ≧ 30 25
95.5889.43 53.07 401 0 5 41.7 25 95.70 89.74 52.51 RI ≧ 30 24 95.58 89.6853.07 401 0 3 33.3 24 95.70 89.96 52.51 RI ≧ 30 23 95.58 89.93 53.56 4010 5 30.7 23 95.70 90.21 53.22 RI ≧ 30 22 95.58 90.17 53.56 401 0 7 34.022 95.70 90.45 53.22 RI ≧ 30 21 95.58 90.42 53.81 401 0 4 31.3 21 95.7090.69 53.46 RI ≧ 30 20 95.58 90.66 53.81 401 0 5 30.7 20 95.70 90.9353.46 RI ≧ 30 19 95.58 90.91 53.81 401 0 3 34.1 19 95.70 91.17 53.46 RI≧ 30 18 95.58 91.15 54.05 401 0 3 47.6 18 95.70 91.41 53.70 RI ≧ 30 1795.58 91.40 54.30 401 0 4 40.4 17 95.70 91.65 53.94 RI ≧ 30 16 95.5891.65 54.55 401 0 4 32.3 16 95.70 91.89 54.18 RI ≧ 30 15 95.58 91.8954.55 401 0 5 31.1 15 95.70 92.12 54.18 RI ≧ 30 14 95.58 92.14 54.55 4010 6 36.6 14 95.70 92.36 54.18 RI ≧ 20 12 95.58 92.63 55.53 401 0 14 23.712 95.70 92.84 55.13 RI ≧ 20 11 95.58 92.87 55.53 401 0 3 27.5 11 95.7093.08 55.13 RI ≧ 20 10 95.58 93.12 55.77 401 0 5 23.4 10 95.70 93.3255.37 RI ≧ 20 9 95.58 93.37 56.27 401 0 4 23.5 9 95.70 93.56 55.85 RI ≧20 8 95.58 93.61 57.00 401 0 22 21.4 8 95.70 93.79 56.56 RI ≧ 20 7 95.5893.86 57.49 401 0 8 21.1 7 95.70 94.03 57.04 RI ≧ 20 6 95.58 94.10 57.74401 0 3 23.8 6 95.70 94.27 57.28 RI ≧ 20 5 95.58 94.35 57.99 401 0 624.4 5 95.70 94.51 57.52 RI ≧ 20 4 95.58 94.59 57.99 401 0 4 28.0 495.70 94.75 57.52 RI ≧ 20 3 95.58 94.84 58.23 401 0 6 25.6 3 95.70 94.9957.76 RI ≧ 20 2 95.58 95.09 58.23 401 0 3 26.8 2 95.70 95.23 57.76Predicted drivers 2 95.58 95.09 58.97 401 0 17 19.0 2 95.70 95.23 56.47Predicted drivers 2 95.58 95.09 59.46 401 0 3 7.7 2 95.70 95.23 58.95Predicted drivers 2 95.58 95.09 59.46 401 0 3 28.0 2 95.70 95.23 58.95Predicted drivers 2 95.58 95.09 59.46 401 0 2 12.7 2 95.70 95.23 58.95Predicted drivers 2 95.58 95.09 59.71 401 0 2 10.3 2 95.70 95.23 59.19Predicted drivers 2 95.58 95.09 59.71 401 0 3 19.9 2 95.70 95.23 59.19Predicted drivers 2 95.58 95.09 59.95 401 0 4 19.0 2 95.70 95.23 59.43Predicted drivers 2 95.58 95.09 60.44 401 0 2 19.8 2 95.70 95.23 59.90Predicted drivers 2 95.58 95.09 
60.44 401 0 4 21.4 2 95.70 95.23 59.90Predicted drivers 2 95.58 95.09 60.93 401 0 3 23.6 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 2 19.8 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 0 0.0 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 2 19.2 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 1 6.2 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 0 0.0 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 5 14.8 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 1 6.4 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 6 9.0 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 0 0.0 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 1 5.5 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 0 0.0 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 2 19.6 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 60.93 401 0 6 9.1 2 95.70 95.23 60.38Predicted drivers 2 95.58 95.09 61.18 401 0 4 25.5 2 95.70 95.23 60.62Predicted drivers 2 95.58 95.09 61.43 401 0 3 8.9 2 95.70 95.23 60.86Predicted drivers 2 95.58 95.09 61.67 401 0 2 15.9 2 95.70 95.23 61.10Predicted drivers 2 95.58 95.09 61.92 401 0 1 5.3 2 95.70 95.23 61.34Predicted drivers 2 95.58 95.09 61.92 401 0 0 0.0 2 95.70 95.23 61.34Predicted drivers 2 95.58 95.09 61.92 401 0 0 0.0 2 95.70 95.23 61.34Predicted drivers 2 95.58 95.09 61.92 401 0 3 23.6 2 95.70 95.23 61.34Predicted drivers 2 95.58 95.09 61.92 401 0 1 5.3 2 95.70 95.23 61.34Predicted drivers 2 95.58 95.09 61.92 401 0 0 0.0 2 95.70 95.23 61.34Predicted drivers 2 95.58 95.09 61.92 401 0 5 23.7 2 95.70 95.23 61.34Predicted drivers 2 95.58 95.09 61.92 401 0 1 6.6 2 95.70 95.23 61.34Predicted drivers 2 95.58 95.09 62.16 401 0 2 10.3 2 95.70 95.23 61.58Predicted drivers 2 95.58 95.09 62.65 401 0 3 19.1 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 1 9.3 2 95.70 95.23 62.05Predicted drivers 
2 95.58 95.09 62.65 401 0 2 19.6 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 0 0.0 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 1 6.4 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 2 15.7 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 1 7.8 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 0 0.0 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 2 7.0 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 1 6.4 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 1 8.3 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 0 0.0 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 0 0.0 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 1 9.9 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.65 401 0 0 0.0 2 95.70 95.23 62.05Predicted drivers 2 95.58 95.09 62.90 401 0 3 26.3 2 95.70 95.23 62.29Predicted drivers 2 95.58 95.09 62.90 401 0 0 0.0 2 95.70 95.23 62.29Predicted drivers 2 95.58 95.09 62.90 401 0 4 24.7 2 95.70 95.23 62.29Predicted drivers 2 95.58 95.09 62.90 401 0 7 33.2 2 95.70 95.23 62.29Predicted drivers 2 95.58 95.09 62.90 401 0 1 9.8 2 95.70 95.23 62.29Predicted drivers 2 95.58 95.09 62.90 401 0 5 19.2 2 95.70 95.23 62.29Predicted drivers 2 95.58 95.09 62.90 401 0 0 0.0 2 95.70 95.23 62.29Predicted drivers 2 95.58 95.09 63.14 401 0 5 8.5 2 95.70 95.23 62.77Predicted drivers 2 95.58 95.09 63.14 401 0 1 8.4 2 95.70 95.23 62.77Predicted drivers 2 95.58 95.09 63.14 401 0 1 9.8 2 95.70 95.23 62.77Predicted drivers 2 95.58 95.09 63.14 401 0 2 10.3 2 95.70 95.23 62.77Predicted drivers 2 95.58 95.09 63.14 401 0 3 9.8 2 95.70 95.23 62.77Predicted drivers 2 95.58 95.09 63.14 401 0 1 6.8 2 95.70 95.23 62.77Predicted drivers 2 95.58 95.09 63.14 401 0 1 7.4 2 95.70 95.23 62.77Predicted drivers 2 95.58 95.09 63.88 401 0 9 15.4 2 95.70 95.23 63.48Predicted drivers 2 95.58 95.09 64.13 401 0 5 18.5 2 95.70 95.23 
63.72Predicted drivers 2 95.58 95.09 64.37 401 0 1 9.9 2 95.70 95.23 63.96Predicted drivers 2 95.58 95.09 64.37 401 0 1 8.9 2 95.70 95.23 63.96Predicted drivers 2 95.58 95.09 64.37 401 0 0 0.0 2 95.70 95.23 63.96Predicted drivers 2 95.58 95.09 64.37 401 0 2 19.6 2 95.70 95.23 63.96Predicted drivers 2 95.58 95.09 64.62 401 0 4 21.1 2 95.70 95.23 64.20Predicted drivers 2 95.58 95.09 64.86 401 0 3 21.0 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 2 13.6 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 0 0.0 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 1 4.5 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 0 0.0 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 2 19.8 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 3 19.6 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 2 18.7 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 0 0.0 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 1 8.8 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 0 0.0 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 2 19.8 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 2 12.9 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 64.86 401 0 3 29.4 2 95.70 95.23 64.44Predicted drivers 2 95.58 95.09 65.11 401 0 1 8.7 2 95.70 95.23 64.68Predicted drivers 2 95.58 95.09 65.11 401 0 3 28.3 2 95.70 95.23 64.68Predicted drivers 2 95.58 95.09 65.11 401 0 0 0.0 2 95.70 95.23 64.68Predicted drivers 2 95.58 95.09 65.36 401 0 2 18.5 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 1 7.5 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 2 13.5 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 5 14.6 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 66.36 401 0 2 18.2 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 1 8.0 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 2 
12.7 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 2 12.1 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 1 8.6 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 2 11.2 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 0 0.0 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 1 5.9 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 4 30.3 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 0 0.0 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 1 9.9 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 0 0.0 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.36 401 0 0 0.0 2 95.70 95.23 64.92Predicted drivers 2 95.58 95.09 65.60 401 0 1 6.3 2 95.70 95.23 65.16Predicted drivers 2 95.58 95.09 65.60 401 0 0 0.0 2 95.70 95.23 65.16Predicted drivers 2 95.58 95.09 65.60 401 0 2 8.2 2 95.70 95.23 65.16Predicted drivers 2 95.58 95.09 65.60 401 0 4 23.0 2 95.70 95.23 65.16Predicted drivers 2 95.58 95.09 65.60 401 0 1 7.6 2 95.70 95.23 65.16Predicted drivers 2 95.58 95.09 65.60 401 0 0 0.0 2 95.70 95.23 65.16Predicted drivers 2 95.58 95.09 65.60 401 0 0 0.0 2 95.70 95.23 65.16Predicted drivers 2 95.58 95.09 65.60 401 0 2 10.5 2 95.70 95.23 65.16Predicted drivers 2 95.58 95.09 66.09 401 0 3 28.8 2 95.70 95.23 65.63Predicted drivers 2 95.58 95.09 66.09 401 0 0 0.0 2 95.70 95.23 65.63Predicted drivers 2 95.58 95.09 66.09 401 0 0 0.0 2 95.70 95.23 65.63Predicted drivers 2 95.58 95.09 66.09 401 0 8 26.8 2 95.70 95.23 65.63Predicted drivers 2 95.58 95.09 66.34 401 0 1 7.0 2 95.70 95.23 65.87Predicted drivers 2 95.58 95.09 66.34 401 0 2 13.9 2 95.70 95.23 65.87Predicted drivers 2 95.58 95.09 66.34 401 0 0 0.0 2 95.70 95.23 65.87Predicted drivers 2 95.58 95.09 66.34 401 0 0 0.0 2 95.70 95.23 65.87Predicted drivers 2 95.58 95.09 66.58 401 0 1 9.9 2 95.70 95.23 66.11Predicted drivers 2 95.58 95.09 66.83 401 0 1 9.8 2 95.70 95.23 66.35Predicted drivers 2 95.58 95.09 
66.83 401 0 0 0.0 2 95.70 95.23 66.35Predicted drivers 2 95.58 95.09 67.57 401 0 3 23.4 2 95.70 95.23 67.06Predicted drivers 2 95.58 95.09 67.57 401 0 1 8.5 2 95.70 95.23 67.06Predicted drivers 2 95.58 95.09 67.57 401 0 1 8.3 2 95.70 95.23 67.06Predicted drivers 2 95.58 95.09 67.57 401 0 0 0.0 2 95.70 95.23 67.06Predicted drivers 2 95.58 95.09 67.57 401 0 0 0.0 2 95.70 95.23 67.06Predicted drivers 2 95.58 95.09 67.57 401 0 2 17.9 2 95.70 95.23 67.06Predicted drivers 2 95.58 95.09 67.57 401 0 0 0.0 2 95.70 95.23 67.06Predicted drivers 2 95.58 95.09 67.57 401 0 0 0.0 2 95.70 95.23 67.06Predicted drivers 2 95.58 95.09 68.06 401 0 3 27.0 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.06 401 0 0 0.0 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.06 401 0 2 19.8 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.06 401 0 0 0.0 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.06 401 0 2 19.6 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.06 401 0 0 0.0 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.06 401 0 1 9.8 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.06 401 0 0 0.0 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.06 401 0 0 0.0 2 95.70 95.23 67.54Predicted drivers 2 95.58 95.09 68.30 401 0 3 20.4 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 6.8 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 8.7 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 2 12.1 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 3 6.7 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 3 10.0 2 95.70 95.23 67.78Predicted drivers 2 
95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 2 11.6 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 9.9 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 6.8 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 2 12.7 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 5.3 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 2 19.6 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 6.4 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 9.9 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 7.5 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 2 10.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 9.9 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 1 7.8 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 2 17.4 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 2 16.7 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.30 401 0 0 0.0 2 95.70 95.23 67.78Predicted drivers 2 95.58 95.09 68.55 401 0 1 7.4 2 95.70 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 1 4 21.3 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 2 13.1 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted 
drivers 3 95.82 95.09 68.55 402 0 1 9.8 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 1 6.8 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 1 3.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 2 13.5 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 3 22.7 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 2 11.5 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 4 16.5 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.55 402 0 0 0.0 3 95.94 95.23 68.02Predicted drivers 3 95.82 95.09 68.80 402 0 1 9.9 3 95.94 95.23 68.26Predicted drivers 3 95.82 95.09 68.80 402 0 1 5.2 3 95.94 95.23 68.26Predicted drivers 3 95.82 95.09 68.80 402 0 1 9.3 3 95.94 95.23 68.26Predicted drivers 3 95.82 95.09 68.80 402 0 0 0.0 3 95.94 95.23 68.26Predicted drivers 3 95.82 95.09 68.80 402 0 0 0.0 3 95.94 95.23 68.26Predicted drivers 3 95.82 95.09 69.04 402 0 1 6.3 3 95.94 95.23 68.50Predicted drivers 3 95.82 95.09 69.29 402 0 3 17.1 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 1 7.4 3 95.94 95.23 
68.74Predicted drivers 3 95.82 95.09 69.29 402 0 1 8.7 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 1 7.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 1 4.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 1 9.9 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 2 4.5 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 1 7.1 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 1 8.1 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 2 10.9 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 1 4.7 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 2 12.3 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.29 402 0 0 0.0 3 95.94 95.23 68.74Predicted drivers 3 95.82 95.09 69.53 402 0 1 9.9 3 95.94 95.23 68.97Predicted drivers 3 95.82 95.09 69.78 402 0 2 13.5 3 95.94 95.23 69.21Predicted drivers 3 95.82 95.09 69.78 402 0 0 0.0 3 95.94 95.23 69.21Predicted drivers 3 95.82 95.09 69.78 402 0 0 0.0 3 95.94 95.23 69.21Predicted drivers 3 95.82 95.09 69.78 402 0 2 4.6 3 95.94 95.23 69.21Predicted drivers 3 95.82 95.09 69.78 402 0 3 8.8 3 95.94 95.23 69.21Predicted drivers 3 95.82 95.09 69.78 402 0 3 14.7 3 95.94 95.23 69.21Predicted drivers 3 95.82 95.09 70.02 402 0 3 15.4 3 95.94 95.23 69.45Predicted drivers 3 95.82 95.09 70.27 402 0 2 14.1 3 95.94 95.23 69.69Predicted drivers 3 95.82 95.09 70.27 402 0 0 0.0 3 
95.94 95.23 69.69Predicted drivers 3 95.82 95.09 70.52 402 0 1 4.6 3 95.94 95.23 69.93Predicted drivers 3 95.82 95.09 70.52 402 0 1 9.9 3 95.94 95.23 69.93Predicted drivers 3 95.82 95.09 70.52 402 0 0 0.0 3 95.94 95.23 69.93Predicted drivers 3 95.82 95.09 70.76 402 0 2 19.4 3 95.94 95.23 70.17Predicted drivers 3 95.82 95.09 71.01 402 0 1 7.0 3 95.94 95.23 70.41Predicted drivers 3 95.82 95.09 71.01 402 0 2 15.6 3 95.94 95.23 70.41Predicted drivers 3 95.82 95.09 71.01 402 0 2 15.0 3 95.94 95.23 70.41Predicted drivers 3 95.82 95.09 71.01 402 0 0 0.0 3 95.94 95.23 70.41Predicted drivers 3 95.82 95.09 71.25 402 0 2 12.6 3 95.94 95.23 70.64Predicted drivers 3 95.82 95.09 71.25 402 0 0 0.0 3 95.94 95.23 70.64Predicted drivers 3 95.82 95.09 71.25 402 0 1 7.6 3 95.94 95.23 70.64Predicted drivers 3 95.82 95.09 71.50 402 0 1 9.8 3 95.94 95.23 70.88Predicted drivers 3 95.82 95.09 71.50 402 0 1 6.9 3 95.94 95.23 70.88Predicted drivers 3 95.82 95.09 71.50 402 0 0 0.0 3 95.94 95.23 70.88Predicted drivers 3 95.82 95.09 71.50 402 0 0 0.0 3 95.94 95.23 70.88Predicted drivers 3 95.82 95.09 71.50 402 0 1 6.9 3 95.94 95.23 70.88Predicted drivers 3 95.82 95.09 71.50 402 0 0 0.0 3 95.94 95.23 70.88Predicted drivers 3 95.82 95.09 71.50 402 0 0 0.0 3 95.94 95.23 70.88Predicted drivers 3 95.82 95.09 71.50 402 0 1 9.8 3 95.94 95.23 70.88Predicted drivers 3 95.82 95.09 71.74 402 0 2 9.9 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 1 9.1 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 0 0.0 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 0 0.0 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 0 0.0 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 0 0.0 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 1 2.6 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 0 0.0 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 0 0.0 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 0 
0.0 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 2 16.9 3 95.94 95.23 71.12Predicted drivers 3 95.82 95.09 71.74 402 0 0 0.0 3 95.94 95.23 71.12Predicted drivers 4 96.07 95.09 72.97 403 1 23 3.5 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 0 0.0 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 1 8.8 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 0 0.0 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 0 0.0 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 0 0.0 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 1 4.2 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 0 0.0 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 1 5.8 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 0 0.0 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 72.97 403 0 0 0.0 4 96.18 95.23 72.32Predicted drivers 4 96.07 95.09 73.22 403 0 1 9.9 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 7.9 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 5.8 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 7.6 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 2 11.7 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 4.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 
0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 7.8 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 5.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 9.9 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 5.8 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 2 11.2 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 9.9 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 9.6 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 4.2 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 4.6 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 0 0.0 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.22 403 0 1 9.8 4 96.18 95.23 72.55Predicted drivers 4 96.07 95.09 73.46 403 0 3 21.1 4 96.18 95.23 72.79Predicted drivers 4 96.07 95.09 73.46 
403 0 1 8.5 4 96.18 95.23 72.79Predicted drivers 4 96.07 95.09 73.46 403 0 0 0.0 4 96.18 95.23 72.79Predicted drivers 4 96.07 95.09 73.46 403 0 0 0.0 4 96.18 95.23 72.79Predicted drivers 4 96.07 95.09 73.46 403 0 1 9.8 4 96.18 95.23 72.79Predicted drivers 4 96.07 95.09 73.71 403 0 1 8.6 4 96.18 95.23 73.03Predicted drivers 4 96.07 95.09 73.71 403 0 0 0.0 4 96.18 95.23 73.03Predicted drivers 4 96.07 95.09 73.96 403 0 1 7.2 4 96.18 95.23 73.27Predicted drivers 4 96.07 95.09 74.45 403 0 2 10.8 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 1 3.8 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 1 9.8 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 2 4.6 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 1 8.9 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Predicted drivers 4 96.07 95.09 74.45 403 0 0 0.0 4 96.18 95.23 73.75Add fusions — — — — — — — — — — — — Add fusions — — — — — — — — — — — —Add fusions — — — — — — — — — — — — Add fusions — — — — — — — — — — — —Add fusions — — — — — — — — — — — —

FIG. 3 illustrates how the statistical enrichment of recurrently mutated NSCLC exons captures known drivers. Two metrics were employed to prioritize exons with recurrent mutations for inclusion in the CAPP-Seq NSCLC selector. The first, termed Recurrence Index (RI), is defined as the number of unique patients (i.e., tumors) with somatic mutations per kilobase of a given exon. The second metric is based on the minimum number of unique patients (i.e., tumors) with mutations in a given kb of exon. Exons containing at least one non-silent SNV genotyped by TCGA (n=47,769) in a combined cohort of 407 lung adenocarcinoma (LUAD) and squamous cell carcinoma (SCC) patients were analyzed. As shown in FIG. 3(a), known/suspected NSCLC drivers are highly enriched at RI≧30 (inset), comprising 1.8% (n=861) of analyzed exons. As shown in FIG. 3(b), known/suspected NSCLC drivers are highly enriched at ≧3 patients with mutations per exon (inset), encompassing 16% of analyzed exons.
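The two prioritization metrics can be illustrated with a minimal sketch. The exon names, lengths, and patient identifiers below are hypothetical, and combining the two thresholds with a logical AND is an assumption made only for illustration; the text describes the metrics but not how they are combined in the selector design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    name: str                    # hypothetical exon label
    length_bp: int               # exon length in base pairs
    mutated_patients: frozenset  # IDs of unique patients with a somatic mutation


def recurrence_index(exon: Exon) -> float:
    """Unique mutated patients per kilobase of exon (RI as defined in the text)."""
    return 1000.0 * len(exon.mutated_patients) / exon.length_bp


def passes_filters(exon: Exon, min_ri: float = 30.0, min_patients: int = 3) -> bool:
    """Assumed combination of both metrics: RI threshold AND minimum patient count."""
    return (recurrence_index(exon) >= min_ri
            and len(exon.mutated_patients) >= min_patients)


# Toy data, not taken from TCGA.
exons = [
    Exon("exon_A", 200, frozenset({"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"})),
    Exon("exon_B", 1500, frozenset({"p1", "p2", "p3"})),
    Exon("exon_C", 50, frozenset({"p9", "p10"})),
]

selected = [e.name for e in exons if passes_filters(e)]
print(selected)
```

In this toy set, exon_C has a high RI only because it is very short, which is why a minimum patient count is a useful complement to a per-kilobase rate.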

Approximately 8% of NSCLCs contain clinically actionable rearrangements involving the receptor tyrosine kinases ALK, ROS1 and RET (Bergethon et al. (2012) J. Clin. Oncol. 30:863-870; Kwak et al. (2010) N. Engl. J. Med. 363:1693-1703; Pao & Hutchinson (2012) Nat. Med. 18:349-351). To utilize the personalized nature and low false detection rate of structural rearrangements (Leary et al. (2010) Sci. Transl. Med. 2:20ra14; McBride et al. (2010) Genes Chrom. Cancer 49:1062-1069), introns and exons spanning recurrent fusion breakpoints in these genes were included in the final design phase (FIG. 1 b). To detect fusions in tumor and plasma DNA, a breakpoint-mapping algorithm called FACTERA was developed (FIG. 4). Application of FACTERA to next generation sequencing (NGS) data from 2 NSCLC cell lines known to harbor fusions with previously uncharacterized breakpoints (Koivunen et al. (2008) Clin. Cancer Res. 14:4275-4283; Rikova et al. (2007) Cell 131:1190-1203) readily identified the breakpoints in both cases (FIG. 5).

Collectively, the NSCLC CAPP-Seq selector design targets 521 exons and 13 introns from 139 recurrently mutated genes, in total covering ~125 kb (FIG. 1 b). Within this small target (0.004% of the human genome), CAPP-Seq identifies a median of 4 point mutations and covers 96% of patients with lung adenocarcinoma or squamous cell carcinoma. To validate the number of mutations covered per tumor, we examined the selector region in WES data from an independent cohort of 183 lung adenocarcinoma patients (Imielinski et al. (2012) Cell 150:1107-1120). The selector covered 88% of patients with a median of 4 SNVs per patient, thus validating our selector design algorithm (P<1.0×10⁻⁶; FIG. 1 c). When compared to randomly sampling the exome, regions targeted by CAPP-Seq captured ~4-fold as many mutations per patient (at the median, FIG. 1 c). Due to similarities in key oncogenic machinery across cancers (Hanahan & Weinberg (2011) Cell 144:646-674), we hypothesized that our NSCLC selector would perform favorably on other carcinomas. Indeed, when applied to TCGA WES data, the selector successfully captured 99% of colon, 98% of rectal, and 97% of endometrioid uterine carcinomas, with a median of 12, 7, and 3 mutations per patient, respectively (FIG. 1 d). This demonstrates the value of targeting hundreds of recurrently mutated genomic regions and suggests that a CAPP-Seq selector could be designed to simultaneously cover mutations for a wide variety of human malignancies.
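As a rough illustration of the quantities reported above, the following sketch computes the fraction of the genome occupied by a selector and, for a toy cohort, the fraction of patients with at least one mutation inside the selector together with the median number of such mutations per patient. The genome size of 3.1 Gb is an assumed approximation, and the intervals, patient labels, and mutation positions are hypothetical.

```python
from statistics import median

GENOME_BP = 3.1e9      # assumed approximate size of the haploid human genome
SELECTOR_BP = 125_000  # ~125 kb selector, as stated in the text


def genome_fraction_pct(selector_bp: float, genome_bp: float = GENOME_BP) -> float:
    """Selector size as a percentage of the genome."""
    return 100.0 * selector_bp / genome_bp


def in_selector(chrom: str, pos: int, selector) -> bool:
    """True if a position falls inside any half-open selector interval [start, end)."""
    return any(c == chrom and start <= pos < end for c, start, end in selector)


def coverage_stats(mutations_by_patient: dict, selector):
    """Return (fraction of patients with >=1 covered mutation, median covered mutations)."""
    counts = [
        sum(in_selector(chrom, pos, selector) for chrom, pos in muts)
        for muts in mutations_by_patient.values()
    ]
    covered = sum(1 for n in counts if n > 0)
    return covered / len(counts), median(counts)


# Toy selector and cohort, for illustration only.
toy_selector = [("chr7", 100, 200), ("chr12", 500, 600)]
toy_cohort = {
    "A": [("chr7", 150), ("chr12", 550), ("chr1", 10)],
    "B": [("chr7", 199)],
    "C": [("chr3", 42)],
    "D": [("chr12", 500), ("chr12", 599), ("chr7", 100)],
}

fraction_covered, median_mutations = coverage_stats(toy_cohort, toy_selector)
print(round(genome_fraction_pct(SELECTOR_BP), 3), fraction_covered, median_mutations)
```

Under the assumed genome size, 125 kb corresponds to roughly 0.004% of the genome, consistent with the figure quoted in the text.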

Using this CAPP-Seq selector, we profiled a total of 52 samples including NSCLC cell lines, primary tumor specimens, peripheral blood leukocytes (PBLs), and cfDNA isolated from plasma of patients with NSCLC before and after various cancer therapies (Table 2). To assess and optimize the performance of CAPP-Seq, we first applied it to cfDNA purified from healthy control plasma. Approximately 60% of reads mapped within the selector target region (Table 2). Sequenced cfDNA fragments had a median length of 169 bp (FIG. 1 e), closely corresponding to the length of DNA contained within a chromatosome (Fan et al. (2008) Proc. Natl Acad. Sci. USA 105:16266-16271). To optimize library preparation from small quantities of cfDNA, we explored a variety of modifications to the ligation and post-ligation amplification steps, including temperature, incubation time, enzyme source, and “with-bead” clean-up. The optimized protocol increased recovery efficiency by >300% and decreased bias for libraries constructed from as little as 4 ng of cfDNA (FIGS. 6-8). Consequently, fluctuations in sequencing depth were minimal (FIG. 1 f-g) and unlikely to impact performance.

TABLE 2 Profile of samples using NSCLC CAPP-Seq selector

| Sample | DNA mass used for library (ng) | Library mass used for capture (ng) | Total reads mapped | Fraction of properly paired reads | Read on-target rate | Median depth | Median fragment length |
| --- | --- | --- | --- | --- | --- | --- | --- |
| H3122 0.1% into HCC78 | 128 | 111 | 99.0% | 96.8% | 69.5% | 8688 | 173 |
| H3122 1% into HCC78 | 128 | 111 | 98.9% | 96.7% | 69.8% | 8657 | 171 |
| H3122 10% into HCC78 | 128 | 111 | 98.9% | 96.5% | 69.8% | 6890 | 170 |
| H3122 100% | 128 | 111 | 99.0% | 96.8% | 68.6% | 6739 | 174 |
| HCC78 100% | 128 | 111 | 99.0% | 96.9% | 69.7% | 7602 | 172 |
| cfDNA 100% 6 cycles | 32 | 83.3 | 97.5% | 86.7% | 60.3% | 8280 | 168 |
| HCC78 10% into cfDNA 4 cycles | 128 | 83.3 | 97.5% | 83.3% | 59.3% | 2682 | 170 |
| HCC78 10% into cfDNA 8 cycles SigmaWGA | 624 | 83.3 | 79.5% | 72.0% | 50.4% | 15 | 158 |
| HCC78 10% into cfDNA 6 cycles | 32 | 83.3 | 97.7% | 87.2% | 60.4% | 8261 | 169 |
| HCC78 10% into cfDNA 8 cycles NEBNextOvernightBead | 32 | 83.3 | 96.9% | 91.8% | 61.1% | 6258 | 166 |
| HCC78 10% into cfDNA 8 cycles OrigNEBNext 15 minLig | 32 | 83.3 | 98.0% | 93.1% | 60.9% | 9862 | 167 |
| HCC78 10% into cfDNA 4 ng 9 cycles | 4 | 83.3 | 97.6% | 87.6% | 60.5% | 11630 | 169 |
| P11 PBL | 500 | 83.3 | 96.7% | 93.8% | 59.0% | 6970 | 169 |
| P11 Tumor | 500 | 83.3 | 93.4% | 88.3% | 61.3% | 7700 | 156 |
| P6 PBL | 500 | 83.3 | 96.7% | 92.6% | 67.2% | 3848 | 152 |
| P6 Tumor | 1000 | 83.3 | 87.0% | 81.8% | 64.7% | 2445 | 158 |
| P8 PBL | 500 | 83.3 | 96.9% | 93.0% | 65.8% | 4021 | 154 |
| P8 Tumor | 500 | 83.3 | 91.7% | 85.4% | 63.6% | 5331 | 151 |
| P10 PBL | 400 | 83.3 | 96.9% | 93.6% | 65.3% | 4572 | 161 |
| P10 Tumor | 500 | 83.3 | 94.0% | 89.6% | 65.1% | 5335 | 157 |
| P7 PBL | 500 | 83.3 | 97.1% | 93.5% | 67.1% | 3552 | 155 |
| P7 Tumor | 500 | 83.3 | 94.1% | 89.3% | 64.0% | 4793 | 162 |
| HCC78 0.025% into cfDNA | 32 | 83.3 | 98.2% | 87.0% | 46.3% | 3913 | 169 |
| HCC78 0.05% into cfDNA | 32 | 83.3 | 98.1% | 86.1% | 44.7% | 6549 | 169 |
| HCC78 0.1% into cfDNA | 32 | 83.3 | 98.4% | 88.1% | 44.9% | 6897 | 169 |
| HCC78 0.5% into cfDNA | 32 | 83.3 | 98.8% | 89.8% | 46.2% | 8096 | 169 |
| HCC78 1% into cfDNA | 32 | 83.3 | 98.5% | 89.8% | 46.5% | 7779 | 171 |
| P6-1 cfDNA | 17 | 83.3 | 98.6% | 91.3% | 46.4% | 11172 | 166 |
| P6-2 cfDNA | 20 | 83.3 | 98.5% | 92.0% | 46.6% | 8455 | 166 |
| P9 PBL | 500 | 83.3 | 97.0% | 94.4% | 59.2% | 5441 | 172 |
| P9 Tumor | 69 | 83.3 | 99.2% | 97.3% | 55.3% | 7312 | 239 |
| P3 PBL | 500 | 83.3 | 99.3% | 97.8% | 57.0% | 8838 | 235 |
| P3 Tumor | 500 | 83.3 | 99.3% | 98.0% | 66.0% | 9562 | 204 |
| P2 PBL | 500 | 83.3 | 99.2% | 97.5% | 57.7% | 7680 | 235 |
| P2 Tumor | 500 | 83.3 | 99.0% | 97.1% | 62.3% | 7247 | 204 |
| P4 PBL | 500 | 83.3 | 99.1% | 96.5% | 56.5% | 7331 | 227 |
| P4 Tumor | 200 | 83.3 | 97.5% | 94.1% | 60.0% | 3968 | 189 |
| P1 PBL | 500 | 83.3 | 99.3% | 97.1% | 57.1% | 7336 | 220 |
| P1 Tumor | 500 | 83.3 | 94.6% | 90.1% | 60.9% | 976 | 192 |
| P5 PBL | 500 | 83.3 | 99.2% | 97.2% | 58.7% | 8155 | 219 |
| P5 Tumor | 100 | 83.3 | 98.8% | 97.0% | 63.5% | 6930 | 187 |
| P9-1 cfDNA | 12 | 83.3 | 99.1% | 84.2% | 65.6% | 6839 | 172 |
| P9-2 cfDNA | 17 | 83.3 | 98.4% | 83.9% | 65.2% | 6043 | 169 |
| P9-3 cfDNA | 16 | 83.3 | 99.4% | 88.7% | 67.6% | 8141 | 167 |
| P3-1 cfDNA | 15 | 83.3 | 99.2% | 86.0% | 63.5% | 7057 | 170 |
| P3-2 cfDNA | 16 | 83.3 | 99.3% | 86.5% | 63.5% | 10089 | 171 |
| P2-1 cfDNA | 13 | 83.3 | 99.4% | 86.9% | 67.3% | 6876 | 172 |
| P2-2 cfDNA | 16 | 83.3 | 99.5% | 96.4% | 63.6% | 5248 | 185 |
| P1-1 cfDNA | 13 | 83.3 | 99.0% | 85.0% | 64.6% | 5079 | 171 |
| P1-2 cfDNA | 7 | 83.3 | 99.4% | 84.7% | 64.1% | 6487 | 172 |
| P5-1 cfDNA | 9 | 83.3 | 99.3% | 87.8% | 66.6% | 7604 | 169 |
| P5-2 cfDNA | 15 | 83.3 | 99.4% | 88.0% | 67.5% | 10451 | 170 |

FIG. 6 illustrates the improvements in CAPP-Seq performance achieved with optimized library preparation procedures. Using 32 ng of input cfDNA from plasma, standard versus “with bead” (Fisher et al. (2011) Genome Biol. 12:R1) library preparation methods were compared, as well as two commercially available DNA polymerases (Phusion and KAPA HiFi). Template pre-amplification by Whole Genome Amplification (WGA) using Degenerate Oligonucleotide PCR (DOP) was also compared. Indices considered for these comparisons included (a) length of the captured cfDNA fragments sequenced, (b) depth and uniformity of sequencing coverage across all genomic regions in the selector, and (c) sequence mapping and capture statistics, including uniqueness. Collectively, these comparisons identified KAPA HiFi polymerase and a “with bead” protocol as having the most robust and uniform performance.

FIG. 7 illustrates the optimization of allele recovery from low input cfDNA during Illumina library preparation. Bars reflect the relative yield of CAPP-Seq libraries constructed from 4 ng cfDNA, calculated by averaging quantitative PCR measurements of 4 pre-selected reporters within CAPP-Seq with pre-defined amplification efficiencies. (a) Sixteen hour ligation at 16° C. increases ligation efficiency and reporter recovery. (b) Adapter ligation volume did not have a significant effect on ligation efficiency and reporter recovery. (c) Performing enzymatic reactions “with-bead” to minimize tube transfer steps increases reporter recovery. (d) Increasing adapter concentration during ligation increases ligation efficiency and reporter recovery. Reporter recovery is also higher when using KAPA HiFi DNA polymerase compared to Phusion DNA polymerase (e) and when using the KAPA Library Preparation Kit with the modifications in a-d compared to the NuGEN SP Ovation Ultralow Library System with automation on a Mondrian SP Workstation (f). Relative reporter abundance was determined by qPCR using the 2^(−ΔCt) method. All values are mean±s.d. N.S., not significant. Based on these results, it was estimated that combining the methodological modifications in a and c-e improves yield in NGS libraries by 3.3-fold.
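The 2^(−ΔCt) calculation named above can be sketched as follows. This is a minimal illustration that assumes a perfect doubling per PCR cycle; the function names and any Ct values passed to them are hypothetical and are not taken from the experiments described here.

```python
def relative_abundance(ct_sample: float, ct_reference: float) -> float:
    """Relative abundance by the 2^(-dCt) method, with dCt = Ct_sample - Ct_reference.

    Assumes 100% amplification efficiency (one doubling per cycle).
    """
    delta_ct = ct_sample - ct_reference
    return 2.0 ** (-delta_ct)


def mean_relative_yield(ct_samples, ct_references):
    """Average the per-reporter relative abundances (e.g. over 4 qPCR reporters)."""
    ratios = [relative_abundance(s, r) for s, r in zip(ct_samples, ct_references)]
    return sum(ratios) / len(ratios)
```

A reporter that crosses threshold one cycle earlier than its reference is scored as twice as abundant; averaging over several reporters reduces the influence of any single amplicon.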

FIG. 8 illustrates the performance of CAPP-Seq with various amounts of input cfDNA. (a) Length of the captured cfDNA fragments sequenced. (b) Depth of sequencing coverage across all genomic regions in the selector. (c) Sequence mapping and capture statistics. As expected, more input cfDNA mass correlates with more unique fragments sequenced.

The detection limit of CAPP-Seq is affected by the absolute number of available cfDNA molecules in a given volume of peripheral blood, as well as PCR and sequencing errors (i.e., “technical” background). The latter primarily affects substitutions/SNVs as opposed to other CAPP-Seq reporters (i.e., indels (Minoche et al. (2011) Genome Biol. 12:R112) and rearrangements). Separately, mutant cfDNA could be present in the absence of cancer due to contributions from pre-neoplastic cells from diverse tissues (i.e., “biological” background). The combined background from these sources was measured by assessing the error rate at each nucleotide position across the selector in plasma cfDNA from 6 patients and a healthy individual, excluding tumor-derived mutations. Mean and median background rates of ~0.007% and ~0% (not detected, N.D.) were found, respectively (FIG. 9 (a)). Next, we hypothesized that if significant biological background is present, it should be highest for recurrently mutated positions in cancer driver genes. We therefore analyzed mutation rates of 107 recurrent cancer-associated SNVs (Su et al. (2011) J. Mol. Diagn. 13:74-84) in the same 7 plasma samples, again excluding those SNVs found in corresponding tumors. Though the median fractional abundance was comparable (~0%, N.D.), the mean was marginally higher at 0.012% (FIG. 9 (b)). However, only one cancer-associated mutation (TP53 R175H) was detectable in plasma at levels significantly above global background (P<0.01). Since this allele was detected at a median frequency of ~0.3% across all samples (FIG. 9 (c)), we hypothesize that it reflects true biological background and thus excluded it as a potential CAPP-Seq reporter. Collectively, this analysis suggests that biological background is not a significant factor for disease monitoring at the current detection limits of CAPP-Seq.

Next, the allele frequency detection limit and linearity of CAPP-Seq were benchmarked by spiking defined concentrations of fragmented genomic DNA from a NSCLC cell line into cfDNA from a healthy individual (FIG. 9 (d)) or into genomic DNA from a second NSCLC line (FIG. 10 (a)). CAPP-Seq accurately detected variants at fractional abundances between 0.025% and 10% with high linearity (R²≥0.994). Analyses of the influence of the number of SNV reporters on error metrics showed only marginal improvements above a threshold of 4 reporters per tumor (FIGS. 9 (e)-(f), 10 (b)-(c)), equivalent to the median number of SNVs per NSCLC identified by the NSCLC selector. Finally, whether fusion breakpoints and indels could also serve as linear reporters was tested. It was found that the fractional abundance of these mutations correlated highly with expected concentrations (R²≥0.995; FIG. 10 (d)).
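One way to score the linearity of such a spike-in series is the squared Pearson correlation between expected and observed fractional abundances. The sketch below assumes that this is what R² denotes here; the function name is hypothetical, and only the expected dilution points (taken from the sample names in Table 2) come from the text.

```python
def r_squared(expected, observed):
    """Squared Pearson correlation between expected and observed values."""
    n = len(expected)
    mean_e = sum(expected) / n
    mean_o = sum(observed) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(expected, observed))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return (cov * cov) / (var_e * var_o)


# Expected HCC78 spike-in fractions (%) for the cfDNA dilution series.
expected_percent = [0.025, 0.05, 0.1, 0.5, 1.0, 10.0]
# Hypothetical observed fractions (%), for illustration only.
observed_percent = [0.03, 0.06, 0.09, 0.52, 0.97, 10.3]
linearity = r_squared(expected_percent, observed_percent)
```

Because the series spans more than two orders of magnitude, the highest dilution point dominates this statistic; a log-log fit is a common complement when low-abundance accuracy is the main concern.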

Having designed, optimized, and benchmarked CAPP-Seq, we applied it to the discovery of somatic mutations in tumors collected from a diverse group of NSCLC patients (n=11; FIG. 11 (a) and Table 3). To test the breakpoint enumeration capability of CAPP-Seq, 6 patients with clinically confirmed fusions were included. These translocations served as positive controls, along with SNVs in other tumors previously identified by clinical assays (N=9; Table 3). Tumor samples included formalin fixed surgical or biopsy specimens and pleural fluid. At a mean sequencing depth of ~6,000× in tumor and paired germline samples, CAPP-Seq confirmed all previously identified SNVs and fusions (3 and 8, respectively) and discovered many additional somatic variants (FIG. 11 (a) and Table 3). Moreover, CAPP-Seq characterized breakpoints and partner genes at base pair resolution for each of the 8 rearrangements (FIG. 12). Tumors containing fusions were almost exclusively from never smokers and, as expected (Govindan et al. (2012) Cell 150:1121-1134), contained fewer SNVs than those lacking fusions (FIG. 13). Excluding patients with fusions (<10% of the TCGA design cohort), CAPP-Seq identified a median of 4 SNVs per patient as we had predicted (FIG. 1 (b)-(c)).

TABLE 3 Characteristics of patients used for noninvasive detection andmonitoring of circulating tumor DNA by CAPP-Seq. SNVs by Fusions Gradeand Other TNM Stage Pack- Tumor Germline Clinical Detected Case Age SexHistology Histological Features Stage Group Smoker Years Source SourceAssays by FISH P1 66 M Adeno- Papillary type T2aN0M0 B Yes 20 FFPEFrozen carcinoma cores PBL P2 61 M Large Cell NOS T3N1M0 IIIA Yes 80FFPE Frozen cores PBL P3 67 F Adeno- Acinar type T1bN3M0 IIIB Yes 15FFPE Frozen carcinoma cores PBL P4 47 F Adeno- Micropapillary andT2aN2M1b IV Yes 45 FFPE Frozen KRAS G13D carcinoma papillary type coresPBL P5 49 F Adeno- Well differentiated T1bN0M1a IV No 0 FFPE Frozen EGFRL858R; carcinoma cores PBL EGFR T790M P6 54 M Adeno- NOS T3N2M1b IV No 0Fresh Frozen ALK carcinoma PBL P7 50 M Adeno- Poorly differentiatedT1aN2M1b IV Yes 4 FFPE Frozen ALK carcinoma cores PBL P8 48 F Adeno-Mutinous type T4N0M1b IV No 0 FFPE Frozen ALK carcinoma cores PBL P9 49M Adeno- Not otherwise T4N3M1a IV No 0 Fresh Frozen ALK carcinomaspecified (NOS) PBL P10 35 F Adeno- NOS T4N0M0 IIIA No 0 FFPE FrozenROS1 carcinoma cores PBL P11 38 F Adeno- Well-to-moderately T3N2M0 IIIANo 0 FFPE Frozen ROS1 carcinoma differentiated cores PBL : Related toFIGS. 11 (a) and 14, regarding smoking history, ≧20 pack years wasconsidered heavy and >0 pack years was considered light.

To explore the potential clinical utility of CAPP-Seq for disease monitoring and minimal residual disease detection, we next applied CAPP-Seq to serial plasma samples collected from a subset of these same 11 patients (N=6), all of whom had pre- and post-treatment samples available (FIG. 11; Table 4). Starting from ~15 ng of plasma cfDNA (~3 mL of peripheral blood) and sequencing to a mean depth of nearly 8,000× (Table 2), CAPP-Seq detected cancer-derived cfDNA in both early and advanced stage patients (Table 4). Among patients with SNV or indel reporters, all showed a significant reduction in cancer cfDNA burden following treatment, consistent with radiographic response assessment by computed tomography (CT) (FIG. 11 (a)). These included two patients, one with stage IB adenocarcinoma (P1) and another with stage IIIA large cell carcinoma (P2), who underwent surgery with complete tumor resection (FIG. 11 (b)). Post-treatment cancer-derived cfDNA was undetectable in the Stage I patient but was above background for the Stage IIIA patient, suggesting that residual cancer cells remained after surgery even though a complete resection was thought to have been achieved. In a third case (P6), CAPP-Seq detected 3 SNVs and a KIF5B-ALK fusion, and both mutation types reported similar fractional abundances of mutant cfDNA (FIG. 14). Next, we analyzed a patient with 3 fusions and no detectable SNVs/indels (P9), from whom 3 serial cfDNA samples were collected. Abundance of fusion product in the plasma was highly correlated with tumor burden and correctly indicated initial response to therapy followed by relapse (R²=0.97; FIG. 11 (c)). Finally, in a fifth patient (P5), CAPP-Seq identified a sub-clonal population harboring the T790M EGFR gatekeeper mutation (Kobayashi et al. (2005) N. Engl. J. Med. 352:786-792) (FIG. 11 (d)). The ratio between clones was identical in the tumor and pre-treatment plasma cfDNA but changed after treatment with cytotoxic chemotherapy followed by a third-generation EGFR inhibitor (FIG. 11 (d), inset), suggesting that CAPP-Seq can detect clinically relevant subclones and monitor clonal dynamics during therapy. Taken together, these data demonstrate the potential utility of CAPP-Seq as a noninvasive clinical assay for measuring tumor burden in early and advanced stage NSCLC and for monitoring tumor-derived cfDNA during therapy.

TABLE 4 Monitoring of cfDNA in patients using CAPP-Seq. Time point 1Time point 2 Time point 3 Mu- Mu- Mu- Mu- Mu- Mu- Mu- tant tant tanttant tant tant tant Ref. allele Total allele Final allele Total alleleFinal allele Total allele Final Case allele allele Chr Position depthdepth % % depth depth % % depth depth % % P1 A G chr1 156785560 0 45720.000 0.000 3 6202 0.048 0.048 — — — — P1 T G chr1 157806043 0 18380.000 0.000 0 2266 0.000 0.000 — — — — P1 G C chr1 248525206 0 28280.000 0.000 0 4529 0.000 0.000 — — — — P1 C T chr2 33500291 1 943 0.1060.106 0 943 0.000 0.000 — — — — P1 A C chr4 55946307 0 6856 0.000 0.0000 8817 0.000 0.000 — — — — P1 G A chr4 55963949 0 5742 0.000 0.000 07335 0.000 0.000 — — — — P1 A C chr4 55968672 0 5856 0.000 0.000 0 74310.000 0.000 — — — — P1 C T chr6 117642146 0 5266 0.000 0.000 4 68490.058 0.058 — — — — P1 T G chr9 8376700 3 5535 0.054 0.054 0 7322 0.0000.000 — — — — P1 T C chr9 8733625 1 827 0.121 0.121 0 1398 0.000 0.000 —— — — P1 T G chr10 43611663 0 3722 0.000 0.000 0 4565 0.000 0.000 — — —— P1 T G chr15 88522525 1 4919 0.020 0.020 4 6736 0.059 0.059 — — — — P1+G  C chr17 7578474 0 1762 0.000 0.000 0 2373 0.000 0.000 — — — — P1 −A G chr17 29552244 1 4484 0.022 0.022 0 6485 0.000 0.000 — — — — P1 +T  Cchr17 29553484 0 3657 0.000 0.000 0 4713 0.000 0.000 — — — — P1 −T  Cchr17 29592185 3 3694 0.081 0.081 0 3247 0.000 0.000 — — — — P2 A C chr250463926 49 6724 0.729 1.457 0 4981 0.000 0.000 — — — — P2 G A chr389457148 40 4838 0.827 0.827 0 4311 0.000 0.000 — — — — P2 T G chr389468286 5 4667 0.107 0.214 2 3625 0.055 0.110 — — — — P2 T A chr389480240 15 5073 0.296 0.591 0 4321 0.000 0.000 — — — — P2 T A chr466189669 4 950 0.421 0.842 5 1436 0.348 0.696 — — — — P2 T G chr466242868 16 2107 0.759 0.759 0 1655 0.000 0.000 — — — — P2 A C chr5176522747 46 2220 2.072 2.072 0 1377 0.000 0.000 — — — — P2 C T chr6117648229 70 7819 0.895 1.791 0 5985 0.000 0.000 — — — — P2 A C chr1278400637 35 7907 0.443 0.885 1 6326 0.016 0.032 — — — — P2 T G 
chr1278400910 106 8211 1.291 2.582 1 6289 0.016 0.032 — — — — P2 T C chr177577551 112 5629 1.990 1.990 2 3814 0.052 0.052 — — — — P2 T G chr191207247 15 1124 1.335 2.669 0 747 0.000 0.000 — — — — P2 +A  C chr279314100 16 3280 0.488 0.98 0 2390 0.000 0.000 — — — — P3 A C chr177578253 6 6345 0.095 0.095 0 8583 0.000 0.000 — — — — P5 T C chr755249071 42 4736 0.887 0.887 10 5597 0.179 0.179 — — — — P5 G T chr755259515 503 11349 4.432 4.432 58 12222 0.475 0.475 — — — — P5 A G chr1155135338 86 4063 2.117 2.117 10 4798 0.208 0.208 — — — — P5 T C chr177577097 227 7429 3.056 3.056 36 9723 0.370 0.370 — — — — P6 A G chr1278400791 84 13970 0.601 1.203 28 10128 0.276 0.553 — — — — P6 T G chr12129822187 78 8680 0.899 1.797 9 6604 0.136 0.273 — — — — P6 A G chr177576275 140 9376 1.493 1.493 22 7897 0.279 0.279 — — — — P6 KIF5B- —chr10/ — 28 15006 0.187 3.116 2 9989 0.020 0.334 — — — — ALK chr2 P9EML4- — chr2/ — 0 10688 0.000 0.000 0 13647 0.000 0.000 0 13521 0.0000.000 ALK chr2 P9 FYN- — chr6/ — 0 9261 0.000 0.000 0 6826 0.000 0.000 210693 0.019 0.019 ROS1 chr6 P9 ROS1- — chr6/ — 10 8029 0.125 0.125 16485 0.015 0.015 13 9943 0.131 0.131 MKX chr10 Bolded reporters indicatepotential homozygous alleles (see Table 3 and Detailed Methods). Notethat mutant cfDNA percentages for P5 were calculated from the 3 SNVsrepresenting the dominant clone (see FIGS. 11 (a) and 11 (d)); EGFRT790M (chr7: 55249071 C−>T) was not included. Final allelic percentagesreflect any adjustments made based on estimated zygosity (using inferredhomozygous reporters) and/or sequencing coverage. See Detailed Methodsfor details.
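The percentage columns in Table 4 follow from the depth columns: the mutant allele percentage is the mutant allele depth divided by the total depth. The sketch below reproduces the first P2 row (49 mutant reads of 6724 total, 0.729%). The doubling applied to obtain the final percentage (1.457%) is an assumption about how a presumed heterozygous reporter was adjusted; the zygosity and coverage adjustments actually used are those described in the Detailed Methods, and the function names are hypothetical.

```python
def mutant_allele_percent(mutant_depth: int, total_depth: int) -> float:
    """Percentage of reads at a position that support the mutant allele."""
    if total_depth <= 0:
        raise ValueError("total_depth must be positive")
    return 100.0 * mutant_depth / total_depth


def zygosity_adjusted_percent(raw_percent: float, heterozygous: bool) -> float:
    """ASSUMPTION: a heterozygous reporter is doubled, a homozygous one is left as-is."""
    return 2.0 * raw_percent if heterozygous else raw_percent


raw = mutant_allele_percent(49, 6724)          # P2, chr2:50463926, time point 1
final = zygosity_adjusted_percent(raw, True)   # assumed heterozygous
```

Fusion reporters in the table show larger adjustments than a factor of two, so this simple rule should not be expected to reproduce those rows.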

In addition to its potential clinical utility, CAPP-Seq analysis promises to yield novel biological insights. For example, in one patient's tumor (P9), we identified both a classic EML4-ALK fusion and two previously unreported fusions involving ROS1: FYN-ROS1 and ROS1-MKX (FIG. 11 (e), FIG. 15). While the potential function of these novel ROS1 fusions is unknown, to the best of our knowledge this is the first observation of ROS1 and ALK fusions in the same NSCLC patient. All fusions were confirmed by qPCR amplification of genomic DNA, and were independently recovered in plasma samples (Table 4). Separately, among cases with a ROS1 rearrangement, we found an unexpected enrichment for S34F missense mutations in U2AF1, the 35 kD subunit of the U2 spliceosomal complex auxiliary factor. This SNV was initially described as a recurrent heterozygous mutation in myelodysplastic syndrome (Graubert et al. (2012) Nat. Genet. 44:53-57; Yoshida et al. (2011) Nature 478:64-69). While U2AF1 mutations (Imielinski et al. (2012) Cell 150:1107-1120) and ROS1 translocations (Bergethon et al. (2012) J. Clin. Oncol. 30:863-870) were recently reported to occur individually in ~3% and ~1.7% of lung adenocarcinomas, respectively, combining the samples we profiled with publicly available data (Detailed Methods), we observed a significant enrichment for U2AF1 S34F mutations in tumors harboring ROS1 fusions (in 3 of 6; P=0.0019; FIG. 11 (f), FIG. 16 and Detailed Methods).

Finally, we explored whether CAPP-Seq analysis of cfDNA could potentially be used for cancer screening. As proof-of-principle, we blinded ourselves to the mutations present in each patient's tumor and developed a statistical method to test for the presence of cancer DNA in each pre-treatment plasma sample in our cohort (FIG. 17). This method identified mutant DNA in all plasma samples containing tumor-derived mutant alleles above fractional abundances of 0.5%. Mutant DNA below this level could not be detected by our algorithm, but no mutations were falsely called, indicating the high specificity of this approach (FIG. 11 (g) and Detailed Methods). Since ~95% of nodules identified in patients at high risk for NSCLC by low-dose CT are false positives (Aberle et al. (2011) N. Engl. J. Med. 365:395-409), CAPP-Seq could potentially serve as a complementary noninvasive screening test. However, methodological improvements to further lower the detection threshold will be required to detect early stage tumors.

In conclusion, we have developed a flexible method for ultrasensitive and specific assessment of circulating tumor DNA. CAPP-Seq overcomes limitations of previously proposed methods for cfDNA analysis by simultaneously measuring multiple types of mutations without patient-specific optimization and by covering mutations in the majority of patients. Moreover, due to multiplexing, CAPP-Seq is highly economical, and per sample costs for plasma cfDNA are expected to drop further as NGS costs continue to fall. Our method has the potential to accelerate the personalized detection, therapy, and monitoring of cancer patients. We anticipate that CAPP-Seq will prove valuable in a variety of clinical settings, including the assessment of cancer DNA in alternative biological fluids and specimens with low cancer cell content.

Methods

Patient Selection

Between April 2010 and June 2012, patients undergoing treatment for newly diagnosed or recurrent NSCLC were enrolled in a study approved by the Stanford University Institutional Review Board. Enrolled patients had not received blood transfusions within 3 months of blood collection. Patient characteristics are in Table 3.

Sample Collection and Processing

Peripheral blood from consented patients was collected in EDTA Vacutainer tubes (BD). Blood samples were processed within 3 hours of collection. Plasma was separated by centrifugation at 2,500×g for 10 min, transferred to microcentrifuge tubes, and centrifuged at 16,000×g for 10 min to remove cell debris. The cell pellet from the initial spin was used for isolation of germline genomic DNA from PBLs (peripheral blood leukocytes) with the DNeasy Blood & Tissue Kit (Qiagen). Matched tumor DNA was isolated from FFPE specimens or from the cell pellet of pleural effusions. Genomic DNA was quantified by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen).

Cell-Free DNA Purification and Quantification

Cell-free DNA (cfDNA) was isolated from 1-5 mL plasma with the QIAamp Circulating Nucleic Acid Kit (Qiagen). Absolute quantification of purified cfDNA was performed by quantitative PCR (qPCR) using an 81 bp amplicon on chromosome 1 (Fan et al. (2008) Proc. Natl Acad. Sci. USA 105:16266-16271) and a dilution series of intact male human genomic DNA (Promega) as a standard curve. Power SYBR Green was used for qPCR on a HT7900 Real Time PCR machine (Applied Biosystems). Standard PCR thermal cycling parameters were used.
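Absolute quantification against a dilution series, as described above, amounts to fitting Ct against log10 of the known input and inverting the fit for each unknown. The following is a minimal sketch using ordinary least squares; the function names and any numeric inputs are hypothetical, and the actual instrument software may fit the curve differently.

```python
import math


def fit_standard_curve(quantities_ng, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(ct, slope, intercept):
    """Invert the standard curve to estimate the input quantity for a measured Ct."""
    return 10.0 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 corresponds to doubling every cycle."""
    return 10.0 ** (-1.0 / slope) - 1.0
```

A slope near −3.32 corresponds to an efficiency near 100%, which is a common sanity check before trusting interpolated quantities.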

Illumina NGS Library Construction

Indexed Illumina NGS libraries were prepared from cfDNA and shorn tumor, germline, and cell line genomic DNA. For patient cfDNA, 7-32 ng DNA was used for library construction without additional shearing or fragmentation. For tumor, germline, and cell line genomic DNA, 69-1000 ng DNA was sheared prior to library construction with a Covaris S2 instrument using the recommended settings for 200 bp fragments. See Table 2 for details.

The NGS libraries were constructed using the KAPA Library Preparation Kit (Kapa Biosystems) employing a DNA Polymerase possessing strong 3′-5′ exonuclease (or proofreading) activity and displaying the lowest published error rate (i.e., highest fidelity) of all commercially available B-family DNA polymerases (Quail et al. (2012) Nat. Methods 9:10-11; Oyola et al. (2012) BMC Genomics 13:1). The manufacturer's protocol was modified to incorporate with-bead enzymatic and cleanup steps (Fisher et al. (2011) Genome Biol. 12:R1). Briefly, following the end repair reaction, Agencourt AMPure XP beads (Beckman-Coulter) were added to bind and wash the DNA fragments. The DNA was then eluted directly into 50 μL 1× A-tailing buffer containing the A-tailing enzyme. Following the A-tailing reaction, the DNA fragments were forced to bind to the same AMPure XP beads by adding 90 μL (1.8×) of PEG buffer (20% PEG-8000 in 2.5 M NaCl). After washing, the DNA was eluted into 50 μL 1× ligation buffer with ligase and 100-fold molar excess of indexed Illumina TruSeq adapters. Ligation was performed for 16 hours at 16° C. Single-step size selection was performed by adding 40 μL (0.8×) of PEG buffer to enrich for ligated DNA fragments. The ligated fragments were then amplified using 500 nM Illumina backbone oligonucleotides and a variable number of PCR cycles (between 4 and 9) depending on input DNA mass. In order to minimize bias and maximize recovery of GC-rich templates, all PCR reactions were carried out in a BioRad DNA Engine Thermal Cycler with a ramp rate of 2.2° C./sec or an Eppendorf Vapo Protect Mastercycler with the Safe ramp rate setting.

Library purity and concentration were assessed by spectrophotometer (NanoDrop 2000) and qPCR (KAPA Biosystems), respectively. Fragment length was determined on a 2100 Bioanalyzer using the DNA 1000 Kit (Agilent).

Design of Library for Hybrid Selection

Custom hybrid selection was performed with the SeqCap EZ Choice Library, v2.0 (Roche NimbleGen). The custom SeqCap library was designed through the NimbleDesign portal (v1.2.R1) using genome build HG19 NCBI Build 37.1/GRCh37 and with Maximum Close Matches set to 1. Input genomic regions were selected according to the most frequently mutated genes and exons in NSCLC. These regions were identified from the COSMIC database, TCGA, and other published sources as described in the Detailed Materials. Final selector coordinates are provided in Table 1.

Hybrid Selection and High Throughput Sequencing

NimbleGen SeqCap EZ Choice was used according to the manufacturer's protocol with modifications. Between 9 and 12 indexed Illumina libraries were included in a single capture reaction. Prior to hybrid selection, the libraries were quantified with a NanoDrop 2000 spectrophotometer, and 83-111 ng of each library was added (1 μg total DNA per capture reaction). Following hybrid selection, the captured DNA fragments were amplified with 12-to-14 cycles of PCR using 1× KAPA HiFi Hot Start Ready Mix and 2 μM Illumina backbone oligonucleotides in 4-to-6 separate 50 μL reactions. The reactions were then pooled and processed with the QIAquick PCR Purification Kit (Qiagen). Multiplexed libraries were sequenced using 2×100 bp paired-end runs on an Illumina HiSeq 2000.
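The per-library masses quoted above are consistent with splitting 1 μg equally among the libraries in one capture reaction. The short sketch below works through that arithmetic; equal-mass pooling is an assumption inferred from the stated range, and the function name is hypothetical.

```python
def mass_per_library_ng(n_libraries: int, total_capture_ng: float = 1000.0) -> float:
    """Mass of each library when n libraries share one capture reaction equally."""
    if n_libraries <= 0:
        raise ValueError("n_libraries must be positive")
    return total_capture_ng / n_libraries


# 12 libraries per capture gives about 83.3 ng each; 9 libraries gives about 111.1 ng each.
low_end = mass_per_library_ng(12)
high_end = mass_per_library_ng(9)
```

These two values match the "Library mass used for capture" entries of 83.3 ng and 111 ng listed in Table 2.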

Mapping and Quality Control of NGS Data

Paired-end reads were mapped to the hg19 reference genome with BWA 0.6.2 (default parameters) (Li & Durbin (2009) Bioinformatics 25:1754-1760), and sorted/indexed with SAMtools (Li et al. (2009) Bioinformatics 25:2078-2079). QC was assessed using a custom Perl script to collect a variety of statistics, including mapping characteristics, read quality, and selector on-target rate (i.e., number of unique reads that intersect the selector space divided by all aligned reads), generated respectively by SAMtools flagstat, FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and BEDTools coverageBed (Quinlan & Hall (2010) Bioinformatics 26:841-842). Importantly, we used a custom version of coverageBed modified to count each read at most once. Plots of fragment length distribution and sequence depth/coverage were automatically generated for visual QC assessment. To mitigate the impact of sequencing errors, analyses not involving fusions were restricted to properly paired reads, and high-quality bases with a Phred quality score of at least 30 (≤0.1% probability of a sequencing error) were further analyzed.
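The relationship between a Phred score and its error probability, and the on-target rate defined above, can be written out directly. This sketch uses hypothetical helper names and takes integer quality scores as input; it is not the custom Perl script referred to in the text.

```python
def phred_to_error_probability(q: float) -> float:
    """Phred definition: Q = -10 * log10(P_error), so P_error = 10^(-Q/10)."""
    return 10.0 ** (-q / 10.0)


def high_quality_bases(bases, qualities, min_q=30):
    """Keep only base calls whose Phred score is at least min_q."""
    return [b for b, q in zip(bases, qualities) if q >= min_q]


def on_target_rate(unique_on_target_reads: int, aligned_reads: int) -> float:
    """Unique reads intersecting the selector divided by all aligned reads."""
    return unique_on_target_reads / aligned_reads
```

A threshold of Q30 therefore corresponds to an expected error probability of at most 1 in 1,000 per retained base call.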

Analysis of Detection Thresholds by CAPP-Seq

Two dilution series were performed to assess the linearity and accuracy of CAPP-Seq for quantitating tumor-derived cfDNA. In one experiment, shorn genomic DNA from a NSCLC cell line (HCC78) was spiked into cfDNA from a healthy individual, while in a second experiment, shorn genomic DNA from one NSCLC cell line (NCI-H3122) was spiked into shorn genomic DNA from a second NSCLC line (HCC78). A total of 32 ng DNA was used for library construction. Following mapping and quality control, homozygous reporters were identified as alleles unique to each sample with at least 20× sequencing depth at an allelic fraction >80%. Fourteen such reporters were identified between HCC78 genomic DNA and plasma cfDNA (FIG. 9 (d), (e)), whereas 24 reporters were found between NCI-H3122 and HCC78 genomic DNA (FIG. 10).
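The reporter selection rule stated above (unique to one sample, at least 20× depth, allelic fraction above 80%) can be sketched as a simple filter. The `AlleleCall` record and the function name are hypothetical constructs for illustration; the thresholds are the ones given in the text.

```python
from dataclasses import dataclass


@dataclass
class AlleleCall:
    chrom: str
    pos: int
    alt: str
    alt_depth: int
    total_depth: int


def homozygous_reporters(sample_calls, other_sample_alleles,
                         min_depth=20, min_fraction=0.80):
    """Select alleles private to one sample that look homozygous.

    other_sample_alleles is a set of (chrom, pos, alt) tuples seen in the
    sample that the reporters must be distinguished from.
    """
    reporters = []
    for call in sample_calls:
        if (call.chrom, call.pos, call.alt) in other_sample_alleles:
            continue  # not unique to this sample
        if call.total_depth < min_depth:
            continue  # insufficient sequencing depth
        if call.alt_depth / call.total_depth > min_fraction:
            reporters.append(call)
    return reporters
```

Restricting a dilution benchmark to homozygous, sample-private alleles means each reporter is expected to track the spiked-in fraction directly, without a zygosity correction.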

CAPP-Seq Bioinformatics Pipeline

Details of bioinformatics methods are supplied in the Detailed Methods, and a graphical schematic is provided in FIG. 2. Briefly, for detection of SNVs and indels, we employed VarScan 2 (Koboldt et al. (2012) Genome Res. 22:568-576) with strict post-processing filters to improve variant call confidence, and for fusion identification and breakpoint characterization we used a novel algorithm, termed FACTERA (Detailed Methods). To quantify tumor burden in plasma cfDNA, allele frequencies of reporter SNVs/indels were assessed using the output of SAMtools mpileup (Li et al. (2009) Bioinformatics 25:2078-2079), and fusions, if detected, were enumerated with FACTERA.

Statistical Analysis

The NSCLC selector was validated in silico using an independent cohort of lung adenocarcinomas (Imielinski et al. (2012) Cell 150:1107-1120) (FIG. 1 (c)). To assess statistical significance, we analyzed the same cohort using 10,000 random selectors sampled from the exome, each with an identical size distribution to the CAPP-Seq NSCLC selector. The performance of random selectors had a Gaussian distribution, and p-values were calculated accordingly. Note that all identified somatic lesions were considered in this analysis.
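When the random-selector scores are approximately Gaussian, a p-value for the real selector can be read from the fitted normal distribution. The sketch below assumes a one-sided upper-tail test and a hypothetical score such as the median number of mutations captured per patient; neither detail is specified in the text, and the function name is illustrative.

```python
from statistics import NormalDist, mean, stdev


def gaussian_p_value(observed: float, random_scores) -> float:
    """Upper-tail probability of the observed score under a normal fit
    to the scores obtained from random selectors."""
    null = NormalDist(mean(random_scores), stdev(random_scores))
    return 1.0 - null.cdf(observed)
```

Fitting a parametric null in this way allows p-values far smaller than 1/10,000 to be reported, which an empirical count over 10,000 random selectors alone could not resolve.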

We used Monte Carlo sampling to estimate the distribution of background alleles across the NSCLC selector (FIG. 9 (a), (c); Detailed Methods). For each plasma sample, background alleles were defined as alleles remaining after exclusion of germline and/or somatic variant calls made by VarScan 2 (Koboldt et al. (2012) Genome Res. 22:568-576) (somatic p-value=0.01; otherwise, default parameters), and with a Phred quality score ≥30. To evaluate the impact of reporter number on tumor burden estimates, we also performed Monte Carlo sampling (1,000×), varying the number of reporters available {1, 2, . . . , max n} in two spiking experiments (FIG. 9 (d)-(f); FIG. 10 (b)-(d)).
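The reporter-number analysis described above can be illustrated by repeatedly drawing random subsets of reporters and recording the spread of the resulting burden estimate. This is a sketch under stated assumptions: sampling is without replacement, the burden estimate is the mean allele fraction of the subset, and the function names and seed are hypothetical.

```python
import random
from statistics import mean, stdev


def monte_carlo_burden(reporter_fractions, n_reporters, n_iter=1000, seed=0):
    """Mean allele fraction of n_iter random reporter subsets of a given size."""
    rng = random.Random(seed)
    return [mean(rng.sample(reporter_fractions, n_reporters)) for _ in range(n_iter)]


def spread_by_reporter_count(reporter_fractions, n_iter=1000, seed=0):
    """Standard deviation of the burden estimate for subset sizes 1..max n."""
    return {
        k: stdev(monte_carlo_burden(reporter_fractions, k, n_iter, seed))
        for k in range(1, len(reporter_fractions) + 1)
    }
```

Plotting the returned spread against the number of reporters gives the kind of curve from which a plateau, such as the one reported at about 4 reporters, can be read.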

To assess the significance of tumor burden estimates in plasma cfDNA, we compared patient-specific SNV frequencies against the null distribution of background SNVs across the selector. Briefly, patient-specific background was quantified using the method described for FIG. 9 (a) (Detailed Methods), but using the number of SNVs identified in the patient's tumor. For patients with at least 1 SNV, but no other reporter types, tumor-derived cfDNA was considered not detectable if mean SNV fractions fell below the 95^(th) percentile of background alleles (i.e., P≥0.05) (FIG. 11 (a)). (Due to the ultra-low false detection rate for indels (Minoche et al. (2011) Genome Biol. 12:R112) and fusion breakpoints, these mutation types were considered detected when present with >0 read support.) For patients with detectable disease in only 1 time point, the corresponding empirical p-value is shown in FIG. 11 (a). To assess normality, we analyzed the patient with the most reporter alleles (i.e., P2; FIG. 11 (a)), and found that fractional abundance measurements fit a normal distribution (D'Agostino and Pearson omnibus normality test). Thus, for patients with detectable tumor-derived cfDNA in two time points and with at least 3 cfDNA SNVs/indels, the change in tumor burden was statistically assessed using a two-sided paired t-test. For P9, who lacked reporter SNVs/indels, statistical significance was estimated by correlation of CAPP-Seq measurements with known tumor volume (as measured by CT scans).
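The detection rule above compares a patient's mean SNV fraction with a null distribution of background means. A minimal sketch of the empirical p-value and the resulting call is shown below; the function names are hypothetical, and the background values are assumed to have been generated beforehand by sampling the same number of background positions as the patient has reporter SNVs.

```python
def empirical_p_value(observed_mean: float, background_means) -> float:
    """Fraction of background draws at or above the observed mean SNV fraction."""
    at_or_above = sum(1 for b in background_means if b >= observed_mean)
    return at_or_above / len(background_means)


def tumor_cfdna_detected(observed_mean: float, background_means, alpha=0.05) -> bool:
    """Call tumor-derived cfDNA detected only when the empirical P is below alpha,
    i.e. the observed mean exceeds the 95th percentile of background for alpha=0.05."""
    return empirical_p_value(observed_mean, background_means) < alpha
```

Under this rule an observation exactly at P=0.05 is treated as not detectable, matching the P≥0.05 criterion in the text.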

Additional details on cell lines, tumor cell sorting, optimizations of library preparation, mutation/translocation validation, CAPP-Seq design and analytical pipelines including the FACTERA translocation detection tool, and additional statistical methods are presented in the Detailed Methods.

Detailed Methods

A. Molecular Biology Methods

A1. Cell Lines

The lung adenocarcinoma cell lines NCI-H3122 and HCC78 were obtained from ATCC and DSMZ, respectively, and grown in RPMI 1640 with L-glutamine (Gibco) supplemented with 10% fetal bovine serum (Gembio) and 1% penicillin/streptomycin cocktail. Cells were maintained in mid-log-phase growth in a 37° C. incubator with 5% CO₂. Genomic DNA was purified from freshly harvested cells with the DNeasy Blood & Tissue Kit (Qiagen).

A2. Pleural Fluid Processing, Flow Cytometry, and Cell Sorting

Cells from pleural fluid from patients P9 and P6 were harvested by centrifugation at 300×g for 5 min at 4° C. and washed in FACS staining buffer (HBSS+2% heat-inactivated calf serum [HICS]). Red blood cells were lysed with ACK Lysing Buffer (Invitrogen), and clumps were removed by passing through a 100 μm nylon filter. Filtered cells were spun down and resuspended in staining buffer. While on ice, the cell suspension was blocked for 20 min with 10 μg/mL rat IgG and then stained for 20 min with APC-conjugated mouse anti-human EpCAM (BioLegend, clone 9C4), PerCP-Cy5.5-conjugated mouse anti-human CD45 (eBioscience, clone 2D1), and PerCP-eFluor710-conjugated mouse anti-human CD31 (eBioscience, clone WM59). After staining, cells were washed and resuspended with staining buffer containing 1 μg/mL DAPI, analyzed, and sorted with a FACSAria II cell sorter (BD Biosciences). Cell doublets and DAPI-positive cells were excluded from analysis and sorting. CD31⁻CD45⁻EpCAM⁺ cells were sorted into staining buffer, spun down, and flash frozen in liquid nitrogen. DNA was isolated with the QIAamp DNA Micro Kit (Qiagen).

A3. Optimization of NGS Library Preparation from Low Input cfDNA

Any method for detecting mutant cfDNA relies on its ability to interrogate each cfDNA molecule in the circulation in order to maximize sensitivity. For this reason, we used the QIAamp Circulating Nucleic Acid kit (Qiagen) with carrier RNA as per the manufacturer's protocol to isolate cfDNA. We also took specific steps to improve the Illumina library preparation workflow.

Protocols for Illumina library construction were compared in a step-wise manner with the goal of (1) optimizing adapter ligation efficiency, (2) reducing the necessary number of PCR cycles following adapter ligation, (3) preserving the naturally occurring size distribution of cfDNA fragments, and (4) minimizing variability in depth of sequencing coverage across all captured genomic regions. Initial optimization was done with NEBNext DNA Library Prep Reagent Set for Illumina (New England BioLabs), which includes reagents for end-repair of the cfDNA fragments, A-tailing, adapter ligation, and amplification of ligated fragments with Phusion High-Fidelity PCR Master Mix. Input was 4 ng cfDNA (obtained from plasma of the same healthy volunteer) for all conditions. Relative allelic abundance in the constructed libraries was assessed by qPCR of 4 genomic loci (Roche NimbleGen: NSC-0237, NSC-0247, NSC-0268, and NSC-0272) and compared by the 2^(−ΔCt) method.

Ligations were performed at 20° C. for 15 min (as per the manufacturer's protocol), at 16° C. for 16 hours, or with temperature cycling for 16 hours as previously described (Lund et al. (1996) Nucl. Acids Res. 24:800-801). Ligation volumes were varied from the standard (50 μL) down to 10 μL while maintaining a constant concentration of DNA ligase, cfDNA fragments, and Illumina adapters. Subsequent optimizations incorporated ligation at 16° C. for 16 hours in 50 μL reaction volumes.

Next, we compared standard SPRI bead processing procedures, in which new AMPure XP beads are added after each enzymatic reaction and DNA is eluted from the beads for the next reaction, to with-bead protocol modifications as previously described (Fisher, S. et al. (2011) Genome Biol. 12:R1). We compared 2 concentrations of Illumina adapters in the ligation reaction: 12 nM (10-fold molar excess to cfDNA fragments) and 120 nM (100-fold molar excess).

Using the optimized library preparation procedures, we next compared the NEBNext DNA Library Prep Reagent Set (with Phusion DNA Polymerase) to the KAPA Library Preparation Kit (with KAPA HiFi DNA Polymerase). The KAPA Library Preparation Kit with our modifications was also compared to the NuGEN SP Ovation Ultralow Library System with automation on the Mondrian SP Workstation.

A4. Evaluation of Library Preparation Modifications on CAPP-Seq Performance

We performed CAPP-Seq on 32 ng cfDNA using standard library preparation procedures with the NEBNext kit, or with optimized procedures using either the NEBNext kit or the KAPA Library Preparation Kit. In parallel, we performed CAPP-Seq on 4 ng and 128 ng cfDNA using the KAPA kit with our optimized procedures. Indexed libraries were constructed, and hybrid selection was performed in multiplex. The post-capture multiplexed libraries were amplified with Illumina backbone primers for 14 cycles of PCR and then sequenced on a paired-end 100 bp lane of an Illumina HiSeq 2000.

We also evaluated CAPP-Seq on ultralow input following whole genome amplification (WGA). For WGA we chose not to use multiple displacement amplification with Φ29 DNA polymerase, given the small size of cfDNA fragments in plasma (FIG. 1 (e)) and due to concern for chimera formation, which would confound analysis of recurrent gene fusions in NSCLC by CAPP-Seq. Instead we used the SeqPlex DNA Amplification Kit (Sigma-Aldrich), which employs degenerate oligonucleotide primer PCR. We used the upper limit of input into this kit (1 ng) and performed whole genome amplification according to the manufacturer's protocol. Briefly, 1 ng cfDNA was amplified with real-time monitoring with SYBR Green I (Sigma-Aldrich) on a HT7900 Real Time PCR machine (Applied Biosystems). The amplification was terminated after 17 cycles, yielding 2.8 μg DNA. The primer removal step yielded ˜600 ng DNA, and this entire amount was used for library preparation using the NEBNext kit with optimized procedures as described above.

A5. Validation of Variants Detected by CAPP-Seq

All structural rearrangements and a subset of tumoral SNVs detected by CAPP-Seq were independently confirmed by qPCR and/or Sanger sequencing of amplified fragments. For HCC78, a 120 bp fragment containing the SLC34A2-ROS1 breakpoint was amplified from genomic DNA using the primers: 5′-AGACGGGAGAAAATAGCACC-3′ and 5′-ACCAAGGGTTGCAGAAATCC-3′. A 141 bp fragment containing exon 2 of U2AF1 was amplified using the primers: 5′-CATGTGTTTGATATCTTCCCAGC-3′ and 5′-CTGGCTAAACGTCGGTTTATTG-3′. For NCI-H3122, a 143 bp fragment containing the EML4-ALK breakpoint was amplified using the primers: 5′-GAGATGGAGTTTCACTCTTGTTGC-3′ and 5′-GAACCTTTCCATCATACTTAGAAATAC-3′. 5 ng genomic DNA was used as template with 250 nM oligos and 1× Phusion PCR Master Mix (NEB) in 50 μL reactions. Products were resolved on 2.5% agarose gel and bands of the expected size were removed. The amplified DNA fragments were purified using the Qiaquick Gel Extraction Kit (Qiagen) and submitted for Sanger sequencing (Elim Biopharm). For P9, genomic DNA breakpoints were confirmed by qPCR using the primers: 5′-TCCATGGAAGCCAGAAC-3′ and 5′-ATGCTAAGATGTGTCTGTCA-3′ for EML4-ALK; 5′-CCTTAACACAGATGGCTCTTGATGC-3′ and 5′-TCCTCTTTCCACCTTGGCTTTCC-3′ for ROS1-MKX; and 5′-GGTTCAGAACTACCAATAACAAG-3′ and 5′-ACCTGATGTGTGACCTGATTGATG-3′ for FYN-ROS1. For qPCR, 10 ng of pre-amplified genomic DNA was used as template with 250 nM oligos and 1× Power SYBR Green Master Mix in 10 μL reactions performed in triplicate on a HT7900 Real Time PCR machine (Applied Biosystems). Standard PCR thermal cycling parameters were used. Amplification of amplicons spanning all 3 breakpoints detected in P9 was confirmed in tumor genomic DNA as well as plasma cfDNA, and PBL genomic DNA was used as a negative control. Separately, at least 88% of SNVs and indels detected were bona fide somatic mutations in tumors, as 38 of 46 of them were independently observed above 0.025% allele frequency in plasma cfDNA and/or were independently confirmed by SNaPshot clinical assays.

B. Bioinformatics and Statistical Methods

B1. Analysis of CAPP-Seq Background

The CAPP-Seq background rate was estimated by Monte Carlo sampling of allelic frequencies across the NSCLC selector (FIG. 9 (a)). Plasma cfDNA samples were pre-filtered to remove all variant calls and dominant alleles. Specifically, for each patient, we excluded germline, loss of heterozygosity (LOH), and/or somatic variant calls made by VarScan 2 (Koboldt et al. (2012) Genome Res. 22:568-576) (somatic p-value=0.01; otherwise, default parameters). We sampled 4 random background alleles across this subset of the selector (equal to the median number of SNVs per NSCLC patient detected by CAPP-Seq) and calculated their mean allelic frequency, only considering bases discordant with the prevailing genotype of the plasma sample at those 4 positions. This process was iterated 10,000 times, and mean, median, and 75^(th) percentile statistics were collected. The entire procedure was then repeated for 5 total simulations, shown in FIG. 2 a.
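
The sampling loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this work: the function name `simulate_background`, the toy input vector, and the nearest-rank percentile are all hypothetical choices, and the input is assumed to be a list of background allele frequencies that has already been filtered as described.

```python
import random
import statistics

def simulate_background(background_freqs, n_alleles=4, n_iter=10_000, seed=0):
    """Draw n_alleles background frequencies n_iter times; summarize the per-draw means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = sorted(
        statistics.fmean(rng.sample(background_freqs, n_alleles))
        for _ in range(n_iter)
    )
    return {
        "mean": statistics.fmean(means),
        "median": statistics.median(means),
        # simple nearest-rank 75th percentile (an assumption of this sketch)
        "p75": means[int(0.75 * (len(means) - 1))],
    }

# Toy input only: synthetic frequencies, not data from this study.
toy_rng = random.Random(1)
toy_freqs = [toy_rng.uniform(0.0, 0.0005) for _ in range(5000)]
summary = simulate_background(toy_freqs)
print(summary)
```

Repeating the call with five different seeds would mirror the 5 total simulations mentioned in the text.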

We likewise applied Monte Carlo simulation to estimate the probability of finding a background allele in plasma cfDNA at a given fractional abundance (FIG. 9 (c)). For consistency with the ranking of alleles in FIG. 9 (c), we populated a vector containing the mean background allele frequency for each genomic position across 7 plasma cfDNA samples, each filtered to remove dominant alleles as described above. Alleles were randomly sampled from this vector 10,000 times to identify the allele frequency with an empirical p-value of 0.01.

B2. ROS1 and U2AF1 Co-Association Analysis

B2.1 Assembly of ROS1 and U2AF1 Mutant NSCLC

We included only cases in which both ROS1 fusion status and U2AF1 S34 mutation status were known. There were 163 such cases from TCGA (genotyped for U2AF1 by whole exome sequencing and for ROS1 fusions by RNA-Seq as detailed below), 23 cases from Imielinski et al. (2012) Cell 150:1107-1120, 17 cases from Govindan et al. (2012) Cell 150:1121-1134, and 13 cases from the present study (11 patients and 2 NSCLC cell lines). U2AF1 S34F mutations were detected in 11 cases (5 from TCGA, 3 from Imielinski et al., 1 from Govindan et al., and 2 from the present study), and ROS1 fusions were detected in 6 cases (2 from TCGA, described below, and 4 from the present study). Significance testing was performed using Fisher's exact test, and a two-tailed P-value is reported.
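
As a hedged illustration of the significance test, the sketch below computes a two-sided Fisher's exact P-value for a 2×2 table using only the Python standard library. The case totals (163+23+17+13 = 216 cases, 11 U2AF1 S34F, 6 ROS1 fusions) come from the paragraph above, but the number of cases carrying both alterations is not stated there, so `n_both` is a placeholder and the printed value is illustrative only. Both function names are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # sum all tables that are no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

def table_from_counts(total, n_u2af1, n_ros1, n_both):
    """Rows: ROS1 fusion present/absent; columns: U2AF1 S34F present/absent."""
    return (n_both, n_ros1 - n_both, n_u2af1 - n_both,
            total - n_ros1 - n_u2af1 + n_both)

# n_both=1 is a placeholder, NOT a value reported in the text.
example_table = table_from_counts(216, 11, 6, n_both=1)
print(example_table, fisher_exact_two_sided(*example_table))
```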

B2.2. Analysis of Whole Transcriptome Sequencing Data from TCGA for ROS1 Fusions

We identified two TCGA lung adenocarcinoma patients, TCGA-05-4426 and TCGA-64-1680, harboring candidate ROS1 fusions (FIG. 16 (a)). Importantly, the latter patient also has the U2AF1 S34F missense mutation reported in this study and in prior literature (see above). To further analyze both patients' putative rearrangements, whole transcriptome RNA-Seq data (.bam files) were obtained using the UCSC GeneTorrent system (https://cghub.ucsc.edu/downloads.html) and realigned to hg19 using BWA 0.6.2 with default parameters (Li & Durbin (2009) Bioinformatics 25:1754-1760). Importantly, mapped RNA-Seq reads extended significantly past coding regions, allowing for improved assessment of fusion events (FIG. 16 (b), (c)). From a manual inspection of associated RPKM expression data across ROS1 exons (FIG. 16 (a)), we suspected that breakpoint sites for these fusions may lie directly upstream of ROS1 exons 32 and 35, respectively. Using the Integrative Genomics Viewer (IGV) (Robinson et al. (2011) Nat. Biotechnol. 29:24-26), we found improperly paired (or discordant) reads near these exons that link ROS1 to its well-described partners, SLC34A2 and CD74, respectively (FIG. 16 (b), (c)). Indeed, by applying FACTERA's templated fusion discovery (detailed below) to patient TCGA-64-1680, we recovered a single read near ROS1 exon 35 that also maps to CD74 (FIG. 16 (c)). Collectively, these data strongly support the existence of expressed ROS1 fusions in these two TCGA patients.

B3. CAPP-Seq Selector Design

Most human cancers are relatively heterogeneous for somatic mutations in individual genes. Specifically, in most human tumors, recurrent somatic alterations of single genes account for a minority of patients, and only a minority of tumor types can be defined using a small number of recurrent mutations (<5-10) at predefined positions. Therefore, the design of the selector is vital to the CAPP-Seq method because (1) it dictates which mutations can be detected with high probability for a patient with a given cancer, and (2) the selector size (in kb) directly impacts the cost and depth of sequence coverage. For example, the hybrid selection libraries available in current whole exome capture kits range from 51-71 Mb, providing ˜40-60 fold maximum theoretical enrichment versus whole genome sequencing. The degree of potential enrichment is inversely proportional to the selector size such that for a ˜100 kb selector, >10,000 fold enrichment should be achievable.
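
The enrichment figures above follow from a simple ratio of genome size to target size. The short sketch below reproduces the arithmetic, assuming an approximate human genome size of 3,000 Mb (an assumption of this example; the text does not state the value used). The function name `max_enrichment` is hypothetical.

```python
GENOME_MB = 3000.0  # assumed approximate human genome size in Mb (not stated in the text)

def max_enrichment(target_mb, genome_mb=GENOME_MB):
    """Maximum theoretical fold enrichment versus whole genome sequencing."""
    return genome_mb / target_mb

for label, size_mb in [("exome kit, 51 Mb", 51.0),
                       ("exome kit, 71 Mb", 71.0),
                       ("selector, ~100 kb", 0.1)]:
    print(f"{label}: ~{max_enrichment(size_mb):,.0f}-fold")
```

Under this assumption the exome kits land at roughly 42-fold to 59-fold, in line with the ˜40-60 fold range quoted above, and a ˜100 kb selector comfortably exceeds 10,000-fold.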

We employed a six-phase design strategy to identify and prioritize genomic regions for the CAPP-Seq NSCLC selector as detailed below. Three phases were used to incorporate known and suspected NSCLC driver genes, as well as genomic regions known to participate in clinically actionable fusions (phases 1, 5, 6), while another three phases employed an algorithmic approach to maximize both the number of patients covered and SNVs per patient (phases 2-4). The latter relied upon a metric that we termed “Recurrence Index” (RI), defined as the number of NSCLC patients with SNVs that occur within a given kilobase of exonic sequence (i.e., No. of patients with mutations/exon length in kb). RI thus serves to measure patient-level recurrence frequency at the exon level, while simultaneously normalizing for gene/exon size. As a source of somatic mutation data uniformly genotyped across a large cohort of patients, in phases 2-4, we analyzed non-silent SNVs identified in TCGA whole exome sequencing data from 178 patients in the Lung Squamous Cell Carcinoma dataset (SCC) (Hammerman et al. (2012) Nature 489:519-525) and from 229 patients in the Lung Adenocarcinoma (LUAD) datasets (TCGA query date was Mar. 13, 2012). Thresholds for each metric (i.e., RI and patients per exon) were selected to statistically enrich for known/suspected drivers in SCC and LUAD data (FIG. 9). RefSeq exon coordinates (hg19) were obtained via the UCSC Table Browser (query date was Apr. 11, 2012).
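
As a small worked example of the Recurrence Index defined above (number of patients with mutations divided by exon length in kb), the following sketch may help; the function name and the toy exon are hypothetical and not taken from Table 1.

```python
def recurrence_index(n_patients_with_snv, exon_length_bp):
    """RI = patients with SNVs in the exon / exon length in kilobases."""
    return n_patients_with_snv / (exon_length_bp / 1000.0)

# Toy exon: 6 mutated patients across a 200 bp exon gives RI = 30.
toy_ri = recurrence_index(6, 200)
print(toy_ri)
```

Because RI divides by length, a short exon mutated in a handful of patients can outrank a long exon mutated in many, which is the normalization for gene/exon size described in the text.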

The following algorithm was used to design the CAPP-Seq selector(parenthetical descriptions match design phases noted in FIG. 1 (b)).

Phase 1 (Known Drivers)

Initial seed genes were chosen based on their frequency of mutation in NSCLCs.

Analysis of COSMIC (v57) (Forbes et al. (2010) Nucl. Acids Res. 38:D652-657) identified known driver genes that are recurrently mutated in ≧9% of NSCLC (denominator ≧500 cases). Specific exons from these genes were selected based on the pattern of SNVs previously identified in NSCLC. The seed list also included single exons from genes with recurrent mutations that occurred at low frequency but had strong evidence for being driver mutations, such as BRAF exon 15, which harbors V600E mutations in <2% of NSCLC (Ding et al. (2008) Nature 455:1069-1075; Youn & Simon (2011) Bioinformatics 27:175-181; Okuda et al. (2008) Cancer Sci. 99:2280-2285; Su et al. (2011) J. Mol. Diagn. 13:74-84; Tsao et al. (2007) J. Clin. Oncol. 25:5240-5247; Chaft et al. (2012) Mol. Cancer Ther. 11:485-491; Paik et al. (2011) J. Clin. Oncol. 29:2046-2051; Stephens et al. (2004) Nature 431:525-526; Jin et al. (2010) Lung Cancer 69:279-283; Malanga et al. (2008) Cell Cycle 7:665-669).

Phase 2 (Max. Coverage)

For each exon with SNVs covering ≧5 patients in LUAD and SCC, we selected the exon with highest RI that identified at least 1 new patient when compared to the prior phase. Among exons with equally high RI, we added the exon with minimum overlap among patients already captured by the selector. This was repeated until no further exons met these criteria.
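
The greedy procedure of phase 2 can be sketched as follows. This is a simplified reading of the paragraph above, not the implementation used to build the selector: the data layout (a dict mapping exon name to length and patient set), the function name, and the final alphabetical tie-break are assumptions of this example.

```python
def select_exons_phase2(exons, covered=frozenset(), min_patients=5):
    """exons: dict of name -> (length_bp, set of patient ids with SNVs in that exon)."""
    covered = set(covered)
    remaining = {
        name: (length, set(patients))
        for name, (length, patients) in exons.items()
        if len(patients) >= min_patients
    }
    chosen = []
    while True:
        # keep only exons that would add at least one new patient
        candidates = [
            (name, len(pats) / (length / 1000.0), len(pats & covered))
            for name, (length, pats) in remaining.items()
            if pats - covered
        ]
        if not candidates:
            break
        # highest RI first; ties go to minimum overlap with covered patients
        name, _, _ = min(candidates, key=lambda c: (-c[1], c[2], c[0]))
        chosen.append(name)
        covered |= remaining.pop(name)[1]
    return chosen, covered

# Toy exons only (hypothetical names and patient ids).
toy_exons = {
    "E1": (200, {1, 2, 3, 4, 5, 6}),
    "E2": (200, {1, 2, 3, 4, 5, 6}),   # same patients as E1, adds nobody new
    "E3": (500, {5, 6, 7, 8, 9}),
    "E4": (100, {10, 11}),             # fewer than 5 patients, never considered
}
chosen, covered = select_exons_phase2(toy_exons)
print(chosen, sorted(covered))
```

Phases 3 and 4 would reuse the same loop with a different objective (largest reduction in patients with only 1 SNV) and the RI thresholds given below.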

Phase 3 (RI≧30)

For each remaining exon with an RI≧30 and with SNVs covering ≧3 patients in LUAD and SCC, we identified the exon that would result in the largest reduction in patients with only 1 SNV. To break ties among equally best exons, the exon with highest RI was chosen. This was repeated until no additional exons satisfied these criteria.

Phase 4 (RI≧20)

Same procedure as phase 3, but using RI≧20.

Phase 5 (Predicted Drivers)

We included all exons from additional genes previously predicted to harbor driver mutations in NSCLC (Ding et al. (2008) Nature 455:1069-1075; Youn & Simon (2011) Bioinformatics 27:175-181).

Phase 6 (Add Fusions)

For recurrent rearrangements in NSCLC involving the receptor tyrosine kinases ALK, ROS1, and RET, the introns most frequently implicated in the fusion event and the flanking exons were included.

All exons included in the selector, along with their corresponding HUGO gene symbols and genomic coordinates, as well as patient statistics for NSCLC and a variety of other cancers, are provided in Table 1, organized by selector design phase.

C. CAPP-Seq Computational Pipeline

C1. Mutation Discovery: SNVs/Indels

For detection of somatic SNV and insertion/deletion events, we employed VarScan 2 (Koboldt et al. (2012) Genome Res. 22:568-576) (somatic p-value=0.01, minimum variant frequency=5%, and otherwise default parameters). Somatic variant calls (SNV or indel) present at less than 0.5% mutant allelic frequency in the paired normal sample (PBLs), but in a position with at least 1000× overall depth in PBLs and 100× depth in the tumor, and with at least 1× read depth on each strand, were retained (Table 3). While the selector was designed to predominantly capture exons, in practice, it also captures limited sequence content flanking each targeted region. For instance, this phenomenon is the basis for the (thus far) uniformly successful recovery by CAPP-Seq of fusion partners (which are not included within the selector) for kinase genes such as ALK and ROS1 recurrently rearranged in NSCLC. As such, we also considered variant calls detected within 500 bp of defined selector coordinates. These calls were eliminated if present in non-coding repeat regions, since repeats may confound mapping accuracy. Repeat sequence coordinates were obtained using the RepeatMasker track in the UCSC Table Browser (hg19). Variant annotation was performed using the SeattleSeq Annotation 137 web server (http://snp.gs.washington.edu/SeattleSeqAnnotation137/). Complete details for all identified SNVs and indels are provided in Table 2.

By manual inspection, two patients (P2 and P6) had SNVs with frequencies consistent with potential heterozygous and homozygous alleles. We labeled these alleles accordingly (Table 3), and based on our assumption of zygosity in these two patients, we adjusted measured fractions of heterozygous reporters in plasma cfDNA to better estimate tumor burden (Table 4).

C2. Mutation Discovery: Fusions

For practical and robust de novo enumeration of genomic fusion events and breakpoints from paired-end next-generation sequencing data, we developed a novel heuristic approach, termed FACTERA (FACile Translocation Enumeration and Recovery Algorithm). FACTERA has minimal external dependencies, works directly on a preexisting .bam alignment file, and produces easily interpretable output. Major steps of the algorithm are summarized below, and are complemented by a graphical schematic to illustrate key elements of the breakpoint identification process (FIG. 4).

As input, FACTERA requires a .bam alignment file of paired-end reads produced by BWA (Li & Durbin (2009) Bioinformatics 25:1754-1760), exon coordinates in .bed format (e.g., hg19 RefSeq coordinates), and a .2bit reference genome to enable fast sequence retrieval (e.g., hg19). In addition, the analysis can be optionally restricted to reads that overlap particular genomic regions (.bed file), such as the CAPP-Seq selector used in this work.

FACTERA processes the input in three sequential phases: identification of discordant reads, detection of breakpoints at base pair-resolution, and in silico validation of candidate fusions. Each phase is described in detail below.

C2.1. Identification of Discordant Reads

To iteratively reduce the sequence space for gene fusion identification, FACTERA, like other algorithms (e.g., BreakDancer (Chen et al. (2009) Nat. Methods 6:677-681)), identifies and classifies discordant read pairs. Such reads indicate a nearby fusion event since they either map to different chromosomes or are separated by an unexpectedly large insert size (i.e., total fragment length), as determined by the BWA mapping algorithm. The bitwise flag accompanying each aligned read encodes a variety of mapping characteristics (e.g., improperly paired, unmapped, wrong orientation, etc.) and is leveraged to rapidly filter the input for discordant pairs. The closest exon of each discordant read is subsequently identified, and used to cluster discordant pairs into distinct gene-gene groups, yielding a list of genomic regions R adjacent to candidate fusion sites. For each member gene of a discordant gene pair, the genomic region R_(i) is defined by taking the minimum of all 3′ exon/read coordinates in the cluster, and the maximum of all 5′ exon/read coordinates in the cluster. These regions are used to prioritize the search for breakpoints in the next phase (FIG. 4 (a)).
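
To make the flag-based filter concrete, here is a minimal sketch that classifies one alignment record from its SAM fields. The flag bit values (0x1, 0x2, 0x4, 0x8) and the meaning of RNEXT and TLEN follow the SAM format; the function name, the `max_insert` cutoff of 1000 bp, and the exact decision order are assumptions of this example rather than FACTERA's actual rules.

```python
# SAM flag bits used here: paired, properly paired, read unmapped, mate unmapped
PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8

def is_discordant(flag, rname, rnext, tlen, max_insert=1000):
    """Return True if a mapped read pair looks discordant.

    flag/rname/rnext/tlen are the SAM FLAG, RNAME, RNEXT and TLEN fields.
    max_insert is a hypothetical cutoff, not a FACTERA default.
    """
    if not flag & PAIRED or flag & (UNMAPPED | MATE_UNMAPPED):
        return False  # need both mates mapped to call a pair discordant
    if rnext not in ("=", rname):
        return True   # mates map to different chromosomes
    if abs(tlen) > max_insert:
        return True   # unexpectedly large insert size
    return not flag & PROPER_PAIR  # e.g., wrong orientation

print(is_discordant(97, "chr2", "chr6", 0))   # mates on different chromosomes
print(is_discordant(99, "chr2", "=", 250))    # ordinary proper pair
```

Reads passing this filter would then be assigned to their closest exon and grouped by gene pair, as described above.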

C2.2 Detection of Breakpoints at Base Pair-Resolution

Discordant read pairs may be introduced by NGS library preparation and/or sequencing artifacts (e.g., jumping PCR). However, they are also likely to flank the breakpoints of bona fide fusion events. As such, all discordant gene pairs identified in the preceding phase are examined further: if the mapped region of one read matches the soft-clipped region of the other, FACTERA records a putative fusion event. To assess inter-read concordance (e.g., see reads 1 and 2 in FIG. 4 (c)), FACTERA employs the following algorithm. The mapped region of read 1 is parsed into all possible subsequences of length k (i.e., k-mers) using a sliding window (k=10, by default). Each k-mer, along with its lowest sequence index in read 1, is stored in a hash table data structure, allowing k-mer membership to be assessed in constant time (FIG. 4 (c), left panel). Subsequently, the soft-clipped sequence of read 2 is parsed into non-overlapping subsequences of length k, and the hash table is interrogated for matching k-mers (FIG. 4 (c), right panel). If a minimum matching threshold is achieved (0.5× the minimum length of the two compared subsequences), then the two reads are considered concordant. FACTERA will process at most 1000 (by default) putative breakpoint pairs for each discordant gene pair. Moreover, for each gene pair, FACTERA will only compare reads whose orientations are compatible with valid fusions. Such reads have soft-clipped sequences facing opposite directions (FIG. 4 (d), top panel). When this condition is not satisfied, FACTERA uses the reverse complement of read 1 for k-mer analysis (FIG. 4 (d), bottom panel).
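
The k-mer comparison described above can be sketched with a plain Python dict as the hash table. This is an illustrative reading, not FACTERA's code: the function names are hypothetical, and counting k bases per matching k-mer against the 0.5× threshold is an assumption about how the matching threshold is scored.

```python
def kmer_index(seq, k=10):
    """Map every k-mer in seq (sliding window) to its lowest start index."""
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], i)  # setdefault keeps the first (lowest) index
    return index

def reads_concordant(mapped_read1, clipped_read2, k=10, min_fraction=0.5):
    """Compare the mapped part of read 1 with the soft-clipped part of read 2."""
    index = kmer_index(mapped_read1, k)
    matched_bases = 0
    # non-overlapping k-mers from the soft-clipped sequence
    for j in range(0, len(clipped_read2) - k + 1, k):
        if clipped_read2[j:j + k] in index:
            matched_bases += k  # assumption: each matching k-mer counts as k bases
    threshold = min_fraction * min(len(mapped_read1), len(clipped_read2))
    return matched_bases >= threshold

# Toy sequences (hypothetical, for illustration only).
toy_read1 = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
toy_clip2 = toy_read1[5:25]
print(reads_concordant(toy_read1, toy_clip2))
print(reads_concordant(toy_read1, "T" * 20))
```

Storing the lowest index for each k-mer is what later allows the breakpoint offset between the two reads to be estimated from the first matching k-mer.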

In some instances, genomic subsequences flanking the true breakpoint may be nearly or completely identical, causing the aligned portions of soft-clipped reads to overlap. Unfortunately, this prevents an unambiguous determination of the breakpoint. As such, FACTERA incorporates a simple algorithm to arbitrarily adjust the breakpoint in one read (i.e., read 2) to match the other (i.e., read 1). Depending upon read orientation, there are two ways this can occur, both of which are illustrated in FIG. 4 (e). For each read, FACTERA calculates the distance between the breakpoint and the read coordinate corresponding to the first k-mer match between reads. For example, as anecdotally illustrated in FIG. 4 (e), x is defined as the distance between the breakpoint coordinate of read 1 and the index of the first matching k-mer, j, whereas y denotes the corresponding distance for read 2. The offset is estimated as the difference in distances (x, y) between the two reads (see FIG. 4 (e)).

C2.3. In Silico Validation of Candidate Fusions

To confirm each candidate breakpoint in silico, FACTERA performs a local realignment of reads against a template fusion sequence (±500 bp around the putative breakpoint) extracted from the .2bit reference genome. BLAST is currently employed for this purpose, although BLAT or other fast aligners could be substituted. A BLAST database is constructed by collecting all reads that map to each candidate fusion sequence, including discordant reads and soft-clipped reads, as well as all unmapped reads in the original input .bam file. All reads that map to a given fusion candidate with at least 95% identity and a minimum length of 90% of the input read length (by default) are retained, and reads that span or flank the breakpoint are counted. As a final step, output redundancies are minimized by removing fusion sequences within a 20 bp interval of any fusion sequence with greater read support and with the same sequence orientation (to avoid removing reciprocal fusions).

FACTERA produces a simple output text file, which includes, for each fusion sequence, the gene pair, the chromosomal sequence coordinates of the breakpoint, the fusion orientation (e.g., forward-forward or forward-reverse), the genomic sequences within 50 bp of the breakpoint, and depth statistics for reads spanning and flanking the breakpoint. Fusions identified in patients analyzed in this work are provided in Table 3.

C2.4. Experimental Validation of FACTERA

To experimentally evaluate the performance of FACTERA, we generated NGS data from two NSCLC cell lines, HCC78 (21.5M×100 bp paired-end reads) and NCI-H3122 (19.4M×100 bp paired-end reads), each of which has a known rearrangement (ROS1 and ALK, respectively) (Bergethon et al. (2012) J. Clin. Oncol. 30:863-870; McDermott et al. (2008) Cancer Res. 68:3389-3395) with a breakpoint that has, to the best of our knowledge, not been previously published. FACTERA readily revealed evidence for a reciprocal SLC34A2-ROS1 translocation in the former and an EML4-ALK fusion in the latter. Precise breakpoints predicted by FACTERA were experimentally validated by PCR amplification and Sanger sequencing (FIG. 5; see also Validation of Variants Detected by CAPP-Seq). Importantly, FACTERA completed each run in practical time (˜90 sec), using only a single thread on a hexa-core 3.4 GHz Intel Xeon E5690 chip. These initial results illustrate the utility of FACTERA as part of the CAPP-Seq analysis pipeline.

C2.5. Templated Fusion Discovery

We implemented a user-directed option to “hunt” for fusions within expected candidate genes. A fusion could be missed by FACTERA if its fusion detection criteria are incompletely satisfied (for example, if discordant reads, but not soft-clipped reads, are identified); this will most likely occur when fusion allele frequency in the tumor is extremely low. As input, the method is supplied with candidate fusion gene sequences as “baits”. All unmapped and soft-clipped reads in the input .bam file are subsequently aligned to these templates (using blastn) to identify reads that have sufficient similarity to both (for each read, 95% identity, e-value <1.0e-5, and at least 30% of the read length must map to the template, by default). Such reads are output as a list to the user for manual analysis.

We tested this simple approach on a low purity tumor sample found to harbor an ALK fusion by FISH, but not FACTERA (i.e., case P9). Using templates for ALK and its common fusion partner, EML4, we identified 4 reads that mapped to both, in a region with an overall depth of ˜1900×. The estimated allele frequency of 0.21% is strikingly similar to the 0.22% tumor purity measured by FACS (FIG. 15), confirming the utility of the templated fusion discovery method. We subsequently FACS-depleted CD45+ immune populations and re-sequenced this patient's tumor. In the enriched tumor sample, FACTERA identified the EML4-ALK fusion, along with two novel ROS1 fusions (FIG. 4 (e), Table 3).

C3. Mutation Recovery: SNVs/Indels

Using a custom Perl script, previously identified reporter alleles were intersected with a SAMtools mpileup file generated for each plasma cfDNA sample, and the number and frequency of supporting reads were calculated for each reporter allele. Only reporters in properly paired reads at positions with at least 500× overall depth were considered.

C4. Mutation Recovery: Fusions

For enumeration of fusion frequency in sequenced plasma DNA, FACTERA executes the last step of the discovery phase (i.e., in silico validation of candidate fusions, above) using the set of previously identified fusion templates. The fusion allele frequency is calculated as α/β, where α is the number of breakpoint-spanning reads, and β is the mean overall depth within a genomic region ±5 bp around the breakpoint. Regarding the NSCLC selector described in this work, the latter calculation was always performed on the single gene contained in the NSCLC selector library. If both fusion genes are targeted within a selector library, overall depth is estimated by taking the mean depth calculated for both genes.
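
The α/β calculation can be illustrated with a short sketch. The function name, the dict-based depth lookup, and the toy numbers are hypothetical; only the formula (breakpoint-spanning reads divided by mean depth within ±5 bp of the breakpoint) comes from the text.

```python
from statistics import fmean

def fusion_allele_frequency(spanning_reads, depth_by_position, breakpoint, window=5):
    """alpha / beta: spanning reads over mean depth within +/- window bp of the breakpoint."""
    region = [depth_by_position[pos]
              for pos in range(breakpoint - window, breakpoint + window + 1)]
    return spanning_reads / fmean(region)

# Toy depth profile: a flat 2000x around a hypothetical breakpoint at position 100.
toy_depth = {pos: 2000 for pos in range(90, 111)}
toy_af = fusion_allele_frequency(4, toy_depth, breakpoint=100)
print(f"{toy_af:.4%}")
```

When both fusion partners are targeted by a selector, the same function could be called once per gene and the two mean depths averaged before dividing, as described above.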

Notably, in some cases we observed lower fusion allele frequencies than would be expected for heterozygous alleles (e.g., see cell line fusions in Table 3). This was seen in cell lines, in an empirical spiking experiment, and in one patient's tumor and plasma samples (i.e., P6), and could potentially result from inefficient “pull-down” of fusions whose partners are not represented in the selector. Regardless, fusions are useful reporters: they possess virtually no background signal and show linear behavior over defined concentrations in a spiking experiment (FIG. 10 (d)). Moreover, allelic frequencies in plasma are easily adjusted for such inefficiencies by dividing the measured frequency in plasma by the corresponding frequency in the tumor. In cases where sequenced tumor tissue is impure, tumor content can be estimated using the frequencies of SNVs (or indels) as a reference frame, allowing the fusion fraction to be normalized accordingly (Table 4). As for SNVs/indels, only fusions present in at least one plasma sample were included in calculations of tumor burden.

C5. Screening Plasma cfDNA without Knowledge of Tumor DNA

We devised the following statistical algorithm as a first step toward non-invasive cancer screening with plasma cfDNA. The method identifies candidate SNVs using iterative models of (i) background noise in paired germline DNA (in this work, PBLs), (ii) base-pair resolution background frequencies in plasma cfDNA across the selector, and (iii) sequencing error in cfDNA. Anecdotal examples are provided in FIG. 17. The algorithm works in four main steps, detailed below.

As input, the algorithm takes allele frequencies from a single plasma cfDNA sample and analyzes high quality background alleles, defined in a first step for each genomic position as the non-dominant base with highest fractional abundance. Only alleles with depth of at least 500× and strand bias <90% (conservative, by default) are analyzed. For consistency with variant calling, we allowed the screening approach to interrogate selector regions within 500 bp of defined coordinates, expanding the effective sequence space from ˜125 kb to ˜600 kb.

Second, the binomial distribution is used to test whether a given input cfDNA allele is significantly different from the corresponding paired germline allele (FIG. 17 (a)-(b)). Here the probability of success is taken to be the frequency of the background allele in PBLs, and the number of trials is the allele's corresponding depth in plasma cfDNA. To avoid contributions from alleles in rare circulating tumor cells that might contaminate PBLs, input alleles with a fractional abundance greater than 0.5% in paired PBLs (by default) or a Bonferroni-adjusted binomial probability greater than 2.08×10⁻⁸ are not further considered (alpha of 0.05/[˜600 kb*4 alleles per position]).
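
A hedged sketch of this second step is given below, using only the standard library. The Bonferroni threshold (0.05 divided by ˜600 kb × 4 alleles) and the 0.5% PBL cutoff come from the text; the function names, the use of an upper-tail probability P(X ≥ k), and the toy read counts are assumptions of this example.

```python
from math import exp, lgamma, log

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for x in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
                   + x * log(p) + (n - x) * log(1.0 - p))
        total += exp(log_pmf)
    return min(total, 1.0)

def bonferroni_alpha(selector_bp=600_000, alleles_per_position=4, alpha=0.05):
    """Per-test threshold: 0.05 / (~600 kb x 4 alleles), about 2.08e-8."""
    return alpha / (selector_bp * alleles_per_position)

def passes_germline_filter(plasma_support, plasma_depth, pbl_freq, max_pbl_freq=0.005):
    """Keep an allele only if it is rare in PBLs and unexpectedly abundant in plasma."""
    if pbl_freq > max_pbl_freq:
        return False
    p_value = binom_upper_tail(plasma_support, plasma_depth, pbl_freq)
    return p_value <= bonferroni_alpha()

# Toy counts: 20 supporting reads at 1000x depth, PBL background frequency 0.1%.
print(bonferroni_alpha(), passes_germline_filter(20, 1000, 0.001))
```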

Third, a database of cfDNA background allele frequencies is assembled. Here, we used samples analyzed in the present study (i.e., pre-treatment NSCLC samples and 1 sample from a healthy volunteer), except the input sample is left out to avoid bias. Based on the assumption that all background allele fractions follow a normal distribution, a Z-test is employed to test whether a given input allele differs significantly from typical cfDNA background at the same position (FIG. 17 (a)-(b)). All alleles within the selector are evaluated, and those with an average background frequency of 5% or greater (by default) or a Bonferroni-adjusted single-tailed Z-score <5.6 are not further considered (alpha of 0.05, adjusted as above).
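
The third step can be sketched as a per-position Z-score against the leave-one-out background database. The 5% mean background cutoff and the Z-score cutoff of 5.6 are taken from the text; the function names, the use of the sample standard deviation, and the toy frequencies are assumptions of this example.

```python
from statistics import fmean, stdev

def z_score(input_freq, background_freqs):
    """Z-score of the input allele frequency against background at the same position."""
    mu = fmean(background_freqs)
    sd = stdev(background_freqs)  # sample standard deviation (an assumption)
    if sd == 0.0:
        return float("inf") if input_freq > mu else 0.0
    return (input_freq - mu) / sd

def passes_background_filter(input_freq, background_freqs,
                             min_z=5.6, max_mean_background=0.05):
    """Keep alleles that sit far above a low, well-behaved cfDNA background."""
    if fmean(background_freqs) >= max_mean_background:
        return False
    return z_score(input_freq, background_freqs) >= min_z

# Toy background for one position across other samples (input sample left out).
toy_background = [0.0001, 0.0002, 0.00015, 0.00012, 0.00018]
print(z_score(0.001, toy_background), passes_background_filter(0.001, toy_background))
```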

Finally, candidate alleles are tested for remaining possible sequencing errors. This step leverages the observation that non-tumor variants (i.e., “errors”) in plasma cfDNA tend to have a higher duplication rate than bona fide variants detectable in the patient's tumor (data not shown). As such, the number of supporting reads is compared for each input allele between nondeduped (all fragments meeting QC criteria; see Methods) and deduped data (only unique fragments meeting QC criteria). An outlier analysis is then used to distinguish candidate tumor-derived SNVs from remaining background noise (FIG. 17 (a)-(c)). Specifically, to reveal outlier tendency in the data, the square root of the robust distance Rd (Mahalanobis distance) is compared against the square root of the quantiles of a chi-squared distribution Cs. This transformation reveals natural separation between true SNVs and false positives in cancer patients (FIG. 17 (a), (c)), and notably, reveals an absence of outlier structure in patient samples lacking tumor-derived SNVs (FIG. 17 (b), (c)). To automatically call SNVs without prior knowledge, the screening approach iterates through data points by decreasing Rd and recalculating the Pearson's correlation coefficient Rho between Rd and Cs for points 1 to i, where Rd_(i) is the current maximum Rd. The algorithm iteratively reports outliers (i.e., candidate SNVs) until it terminates when Rho ≥ 0.85.
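
The iterative outlier call at the end of this step can be sketched as below. Several points are assumptions made for illustration and are not stated in the text: the robust distances Rd are taken as already computed (for example, from a robust covariance estimate of the nondeduped and deduped read counts) and are treated as squared Mahalanobis-type distances; the chi-squared distribution is given 2 degrees of freedom because two read counts are compared; and the quantiles use plotting positions (k + 0.5)/n. All function names are hypothetical.

```python
import math


def chi2_quantile_df2(p):
    """Quantile of a chi-squared distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - p)


def pearson(xs, ys):
    """Pearson's correlation coefficient; returns 0.0 if either side is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def call_outliers(rd_values, rho_stop=0.85, min_points=3):
    """Report indices of outlying alleles, largest Rd first.

    rd_values -- precomputed robust distances Rd, one per candidate allele
    """
    n = len(rd_values)
    order = sorted(range(n), key=lambda j: rd_values[j])
    rd = [math.sqrt(rd_values[j]) for j in order]                      # sqrt(Rd)
    cs = [math.sqrt(chi2_quantile_df2((k + 0.5) / n)) for k in range(n)]  # sqrt(Cs)

    outliers = []
    i = n
    while i >= min_points:
        # Rho between Rd and Cs for points 1 to i (current maximum is point i).
        if pearson(rd[:i], cs[:i]) >= rho_stop:
            break
        outliers.append(order[i - 1])  # report the current maximum as an outlier
        i -= 1
    return outliers


if __name__ == "__main__":
    background = [chi2_quantile_df2((k + 0.5) / 11) for k in range(10)]
    print(call_outliers(background[:3] + [10000.0] + background[3:]))
```

When the distances already follow the chi-squared quantiles, Rho is high on the first pass and no allele is reported, which corresponds to the absence of outlier structure described above.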

Importantly, this approach positively identified 60% of the cancer samples with tumor-derived SNVs analyzed in this study with no false positive calls (FIG. 11 (g)). When corresponding germline DNA from PBLs is unavailable, one can skip the second step in this screening routine. After removal of germline SNVs with an allelic fraction >20%, this modified approach identified no SNVs when applied to a healthy volunteer.

All patents, patent publications, and other published references mentioned herein are hereby incorporated by reference in their entireties as if each had been individually and specifically incorporated by reference herein.

While specific examples have been provided, the above description is illustrative and not restrictive. Any one or more of the features of the previously described embodiments can be combined in any manner with one or more features of any other embodiments in the present invention. Furthermore, many variations of the invention will become apparent to those skilled in the art upon review of the specification. The scope of the invention should, therefore, be determined by reference to the appended claims, along with their full scope of equivalents.

What is claimed is:
 1. A method for creating a library of recurrently mutated genomic regions comprising: identifying a plurality of genomic regions from a group of genomic regions that are recurrently mutated in a specific cancer; wherein the library comprises the plurality of genomic regions; the plurality of genomic regions comprises at least 10 different genomic regions; and at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.
 2. The method of claim 1, wherein the plurality of genomic regions comprises at least 25, at least 50, at least 100, at least 150, at least 200, or at least 500 different genomic regions.
 3. The method of claim 1, wherein at least two mutations within the plurality of genomic regions or at least three mutations within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.
 4. The method of claim 1, wherein at least one mutation within the plurality of genomic regions is present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% of all subjects with the specific cancer.
 5. The method of claim 1, wherein the identifying step comprises for each genomic region in the plurality of genomic regions, ranking the genomic region to maximize the number of all subjects with the specific cancer having at least one mutation within the genomic region.
 6. The method of claim 1, wherein the identifying step comprises for each genomic region in the plurality of genomic regions, ranking the genomic region to maximize the ratio between the number of all subjects with the specific cancer having at least one mutation within the genomic region and the length of the genomic region.
 7. The method of claim 1, wherein the library comprises a plurality of genomic regions encoding a plurality of driver sequences.
 8. The method of claim 7, wherein the driver sequences are known driver sequences.
 9. The method of claim 7, wherein the driver sequences are recurrently mutated in the specific cancer.
 10. The method of claim 1, wherein the library comprises a plurality of genomic regions that are recurrently rearranged in the specific cancer.
 11. The method of claim 1, wherein the specific cancer is a carcinoma.
 12. The method of claim 11, wherein the carcinoma is an adenocarcinoma, a non-small cell lung cancer, or a squamous cell carcinoma.
 13. The method of claim 1, wherein the cumulative length of the plurality of genomic regions is at most 30 Mb, 20 Mb, 10 Mb, 5 Mb, 2 Mb, 1 Mb, 500 kb, 200 kb, 100 kb, 50 kb, 20 kb, or 10 kb.
 14. A method for analyzing a cancer-specific genetic alteration in a subject comprising the steps of: obtaining a tumor nucleic acid sample and a genomic nucleic acid sample from a subject with a specific cancer; sequencing a plurality of target regions in the tumor nucleic acid sample and in the genomic nucleic acid sample to obtain a plurality of tumor nucleic acid sequences and a plurality of genomic nucleic acid sequences; and comparing the plurality of tumor nucleic acid sequences to the plurality of genomic nucleic acid sequences to identify a patient-specific genetic alteration in the tumor nucleic acid sample; wherein the plurality of target regions are selected from a plurality of genomic regions that are recurrently mutated in the specific cancer; the plurality of genomic regions comprises at least 10 different genomic regions; and at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.
 15. The method of claim 14, wherein the plurality of genomic regions comprises at least 25, at least 50, at least 100, at least 150, at least 200, or at least 500 different genomic regions.
 16. The method of claim 14, wherein at least two mutations within the plurality of genomic regions or at least three mutations within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.
 17. The method of claim 14, wherein at least one mutation within the plurality of genomic regions is present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% of all subjects with the specific cancer.
 18. The method of claim 14, wherein each genomic region in the plurality of genomic regions is identified by ranking the genomic region to maximize the number of all subjects with the specific cancer having at least one mutation within the genomic region.
 19. The method of claim 14, wherein each genomic region in the plurality of genomic regions is identified by ranking the genomic region to maximize the ratio between the number of all subjects with the specific cancer having at least one mutation within the genomic region and the length of the genomic region.
 20. The method of claim 14, wherein the plurality of genomic regions comprises genomic regions encoding a plurality of driver sequences.
 21. The method of claim 20, wherein the driver sequences are known driver sequences.
 22. The method of claim 20, wherein the driver sequences are recurrently mutated in the specific cancer.
 23. The method of claim 14, wherein the plurality of genomic regions comprises genomic regions that are recurrently rearranged in the specific cancer.
 24. The method of claim 14, wherein the specific cancer is a carcinoma.
 25. The method of claim 24, wherein the carcinoma is an adenocarcinoma, a non-small cell lung cancer, or a squamous cell carcinoma.
 26. The method of claim 14, wherein the cumulative length of the plurality of genomic regions is at most 30 Mb, 20 Mb, 10 Mb, 5 Mb, 2 Mb, 1 Mb, 500 kb, 200 kb, 100 kb, 50 kb, 20 kb, or 10 kb.
 27. The method of any one of claims 14-26, further comprising the steps of: obtaining a cell-free nucleic acid sample from the subject; and identifying the patient-specific genetic alteration in the cell-free nucleic acid sample.
 28. The method of claim 27, wherein the step of identifying the patient-specific genetic alteration in the cell-free nucleic acid sample comprises sequencing a genomic region comprising the patient-specific genetic alteration in the cell-free sample.
 29. The method of claim 27, wherein the step of obtaining a tumor nucleic acid sample and a genomic nucleic acid sample comprises the step of enriching the plurality of target regions in the tumor nucleic acid sample and the genomic nucleic acid sample.
 30. The method of claim 29, wherein the enriching step comprises use of a custom library of biotinylated DNA.
 31. The method of claim 27, wherein the step of obtaining a cell-free nucleic acid sample comprises the step of enriching the plurality of target regions in the cell-free nucleic acid sample.
 32. The method of claim 27, further comprising the step of quantifying the cancer-specific genetic alteration in the cell-free sample.
 33. A method for screening a cancer-specific genetic alteration in a subject comprising the steps of: obtaining a cell-free nucleic acid sample from a subject; sequencing a plurality of target regions in the cell-free sample to obtain a plurality of cell-free nucleic acid sequences; and identifying a cancer-specific genetic alteration in the cell-free sample; wherein the plurality of target regions are selected from a plurality of genomic regions that are recurrently mutated in the specific cancer; the plurality of genomic regions comprises at least 10 different genomic regions; and at least one mutation within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.
 34. The method of claim 33, wherein the plurality of genomic regions comprises at least 25, at least 50, at least 100, at least 150, at least 200, or at least 500 different genomic regions.
 35. The method of claim 33, wherein at least two mutations within the plurality of genomic regions or at least three mutations within the plurality of genomic regions is present in at least 60% of all subjects with the specific cancer.
 36. The method of claim 33, wherein at least one mutation within the plurality of genomic regions is present in at least 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% of all subjects with the specific cancer.
 37. The method of claim 33, wherein each genomic region in the plurality of genomic regions is identified by ranking the genomic region to maximize the number of all subjects with the specific cancer having at least one mutation within the genomic region.
 38. The method of claim 33, wherein each genomic region in the plurality of genomic regions is identified by ranking the genomic region to maximize the ratio between the number of all subjects with the specific cancer having at least one mutation within the genomic region and the length of the genomic region.
 39. The method of claim 33, wherein the plurality of genomic regions comprises genomic regions encoding a plurality of driver sequences.
 40. The method of claim 39, wherein the driver sequences are known driver sequences.
 41. The method of claim 39, wherein the driver sequences are recurrently mutated in the specific cancer.
 42. The method of claim 33, wherein the plurality of genomic regions comprises genomic regions that are recurrently rearranged in the specific cancer.
 43. The method of claim 33, wherein the specific cancer is a carcinoma.
 44. The method of claim 43, wherein the carcinoma is an adenocarcinoma, a non-small cell lung cancer, or a squamous cell carcinoma.
 45. The method of claim 33, wherein the cumulative length of the plurality of genomic regions is at most 30 Mb, 20 Mb, 10 Mb, 5 Mb, 2 Mb, 1 Mb, 500 kb, 200 kb, 100 kb, 50 kb, 20 kb, or 10 kb.
 46. The method of claim 33, wherein the step of obtaining a cell-free nucleic acid sample comprises the step of enriching the plurality of target regions in the cell-free nucleic acid sample.
 47. The method of claim 46, wherein the enriching step comprises use of a custom library of biotinylated DNA.